PROBABILITY AND EXPECTATION INEQUALITIES
by

Stamatis Cambanis* and Gordon Simons**
University  of  North  Carolina 

Abstract 


This paper introduces a mathematical framework within which a wide
variety of known and new inequalities can be viewed from a common
perspective. Probability and expectation inequalities of the following
types are considered: (a) $P(Z \in A) \ge P(Z' \in A)$ for some class of sets $A$,
(b) $Ek(Z) \ge Ek(Z')$ for some class of functions $k$, and (c) $E\ell(Z) \ge Ek(Z')$ for
some class of pairs of functions $\ell$ and $k$. It is shown, sometimes using
explicit constructions of $Z$ and $Z'$, that, in several cases, (a) $\Leftrightarrow$ (b) $\Leftrightarrow$ (c);
included here are cases of normal and elliptically contoured distributions.
A case where (a) $\Rightarrow$ (b) $\Leftrightarrow$ (c) is studied and is expressed in terms of
"n-monotone" functions for (some of) which integral representations are
obtained. Also, necessary and sufficient conditions for (c) are given.
1. Introduction

There is an extensive literature dealing with probability inequalities of
the form

$$P(Z \in A) \ge P(Z' \in A), \quad A \in \mathcal{A}, \tag{1.1}$$

and expectation inequalities of the form

$$Ek(Z) \ge Ek(Z'), \quad k \in \mathcal{F}_1, \tag{1.2}$$

where $Z$ and $Z'$ are random vectors (or more general random elements) with common
range space $R$, $\mathcal{A}$ is a class of (Borel) subsets of $R$, and $\mathcal{F}_1$ is a class of real
(measurable) functions on $R$. Here we also focus attention on expectation
inequalities of the form

$$E\ell(Z) \ge Ek(Z'), \quad (\ell,k) \in \mathcal{F}_2, \tag{1.3}$$

where $\mathcal{F}_2$ is a class of pairs of (measurable) functions on $R$. When the classes
$\mathcal{A}, \mathcal{F}_1, \mathcal{F}_2$ are progressively richer then conditions (1.1), (1.2), (1.3) are
progressively stronger. Specifically if $1_{\mathcal{A}} \subset \mathcal{F}_1$, i.e. if $1_A \in \mathcal{F}_1$ for all $A \in \mathcal{A}$,
then (1.1) $\Leftarrow$ (1.2); and if $\mathcal{F}_1 \subset \{k: (k,k) \in \mathcal{F}_2\} =: \mathcal{F}_{21}$ then (1.2) $\Leftarrow$ (1.3).
The more interesting conclusions are therefore those which lead from (1.1) to
(1.2) to (1.3).

The first question of interest is of course to describe conditions on the
distributions of $Z$ and $Z'$ which guarantee (1.1) for specific classes $\mathcal{A}$ of sets,
and there is a vast literature on this. The second question is, given a class
of sets $\mathcal{A}$, to describe a class of functions $\mathcal{F}_1$, depending of course on $\mathcal{A}$, for
which (1.1) $\Rightarrow$ (1.2); if such a class $\mathcal{F}_1$ contains $1_{\mathcal{A}}$ then in fact (1.1) $\Leftrightarrow$ (1.2).
The third question is, given a class of functions $\mathcal{F}_1$, to describe a class $\mathcal{F}_2$ of
pairs of functions for which (1.2) $\Rightarrow$ (1.3); and if furthermore $\mathcal{F}_1 \subset \mathcal{F}_{21}$
then (1.2) $\Leftrightarrow$ (1.3). Clearly this equivalence holds for the class $\mathcal{F}_2$
defined by what may be called the "separation approach":

$$\mathcal{F}_2 = \{(\ell,k): \ell \ge m \ge k \text{ for some } m \in \mathcal{F}_1, \text{ and the expectations in (1.3) are defined}\}. \tag{1.4}$$

(This approach is most useful when there is a simple direct description of
$\mathcal{F}_2$, one that does not predicate the existence of quantities with certain
properties.) When positive answers to the second and third questions are
feasible, one of the following relationships will follow:

i. (1.1) $\Rightarrow$ (1.2) $\Rightarrow$ (1.3)

ii. (1.1) $\Rightarrow$ (1.2) $\Leftrightarrow$ (1.3)                (1.5)

iii. (1.1) $\Leftrightarrow$ (1.2) $\Leftrightarrow$ (1.3).

An interesting example of (1.5.iii) is described by Kemperman [5].

Suppose that $R$ is a partially ordered space and that (1.1) holds for the class
$\mathcal{A}$ of all measurable increasing sets $A$ (i.e. $a \in A$ and $a \le b$, in the sense of
the partial ordering, imply $b \in A$). Then, by considering simple function
approximations, one obtains (1.1) $\Leftrightarrow$ (1.2) where $\mathcal{F}_1$ is the class of all
measurable increasing functions $k$ (in the sense of the partial ordering) for
which the expectations in (1.2) are defined. Using the separation approach one
also obtains (1.2) $\Leftrightarrow$ (1.3), where $\mathcal{F}_2$ is defined by (1.4) and has the
alternative direct description as the class of all pairs of functions $(\ell,k)$
satisfying

$$k(x) \le \ell(y), \quad x \le y, \tag{1.6}$$

and for which the expectations in (1.3) are defined, provided the "separating"
increasing function $m$ defined for instance by

$$m(y) = \sup\{k(x): x \le y\}$$

is measurable, as is the case when $R$ is the real line. (We infer from a
comment made by Kemperman [5] that measurability of $m$ may fail even in $\mathbb{R}^2$.)

New  examples  of  (1.5.iii)  are  described  in  Section  3  for  bivariate  random 
variables  with  normal  distributions  (Theorem  3.1)  and  with  certain  elliptically 
contoured  distributions  (Theorem  3.2). 

An interesting example of (1.5.ii) is described in Section 2 when the
range of $Z = (X,Y)$ and $Z' = (X',Y')$ is the real plane. If $\mathcal{A}$ is the class of
all closed symmetric rectangles then (1.1) implies that

$$Eh(X^2+Y^2) \ge Eh(X'^2+Y'^2) \tag{1.7}$$

for all nonincreasing, convex functions $h$ on the positive half line; i.e.
(1.1) $\Rightarrow$ (1.2) where $\mathcal{F}_1$ is the class of all functions $k(x,y)$ of the form
$h(x^2+y^2)$ with $h$ as above. It should be pointed out that (1.1) no longer
implies (1.2) if $h$ is either not nonincreasing or not convex; convexity would
be unnecessary if (1.1) implied that $X^2+Y^2$ is stochastically smaller than
$X'^2+Y'^2$, which is not true in general. In order to use the separation
approach, we note that functions $f$ and $g$ on the positive real line can be
separated by a convex function $h$,

$$f \le h \le g, \tag{1.8}$$

if and only if

$$f[\lambda s + (1-\lambda)t] \le \lambda g(s) + (1-\lambda)g(t), \quad s < t, \quad 0 \le \lambda \le 1, \tag{1.9}$$

and then the convex separating function $h$ can be defined (not necessarily
uniquely) by

$$h(u) = \inf\left\{\frac{t-u}{t-s}\, g(s) + \frac{u-s}{t-s}\, g(t):\ s \le u \le t\right\}. \tag{1.10}$$

Also, this choice of $h$, or some simple modification of it, is nonincreasing if
and only if

$$f(t) \le g(s), \quad s < t. \tag{1.11}$$

Consequently (1.1) implies

$$Eg(X^2+Y^2) \ge Ef(X'^2+Y'^2) \tag{1.12}$$

for all functions $f$ and $g$ satisfying (1.9) and (1.11) and such that the
expectations in (1.12) are defined; i.e. (1.1) $\Rightarrow$ (1.2) $\Leftrightarrow$ (1.3) where $\mathcal{F}_2$ is
the class of all pairs of functions $\ell(x,y) = g(x^2+y^2)$, $k(x,y) = f(x^2+y^2)$ where $f$ and
$g$ are described above. Again, in this case, the class $\mathcal{F}_2$, originally
introduced via the separation approach, has a direct description. Section 2
includes additional implications of the form (1.1) $\Rightarrow$ (1.2), which are
described for n-dimensional vectors $Z$ and $Z'$ ($n \ge 2$), and also integral
representations for certain "n-monotone" functions which may be of
independent interest.
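The convex separating function of (1.10) can be illustrated numerically. The following is only a sketch under assumptions that are not in the paper: the function g is sampled on a finite grid, the infimum is taken over grid points only, and the name `convex_minorant` and the double-well test function are hypothetical choices made for the illustration.

```python
def convex_minorant(grid, g_values):
    """Evaluate h(u) = inf{ w*g(s) + (1-w)*g(t) : s <= u <= t } at each grid point.

    The weight w = (t-u)/(t-s) is the one for which u = w*s + (1-w)*t.
    The degenerate pair s = t = u contributes g(u) itself.
    """
    n = len(grid)
    h = []
    for j in range(n):
        u = grid[j]
        best = g_values[j]                 # case s = t = u
        for i in range(j + 1):             # s = grid[i] <= u
            for k in range(j, n):          # t = grid[k] >= u
                s, t = grid[i], grid[k]
                if s == t:
                    continue
                w = (t - u) / (t - s)
                best = min(best, w * g_values[i] + (1 - w) * g_values[k])
        h.append(best)
    return h


# Hypothetical test function: a double well on [0, 4], which is not convex.
grid = [0.25 * i for i in range(17)]
g = [((s - 2.0) ** 2 - 1.0) ** 2 for s in grid]
h = convex_minorant(grid, g)
```

On this grid the computed h lies below g, is convex, and bridges the two wells of g; any f with f ≤ h is then separated from g by h as in (1.8).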

Kemperman [5] also describes an alternative approach based on a theorem of
Strassen [10] which guarantees that when $R$ is a partially ordered complete
separable metric space and $\mathcal{A}$ is the class of all measurable increasing sets,
then (1.1) is equivalent to the following:

There exist two random variables $Z_0, Z_0'$ with the same marginal
distributions as $Z, Z'$ and such that $Z_0 \ge Z_0'$ a.s.                (1.13)

It is then immediate that (1.13) $\Rightarrow$ (1.3) $\Rightarrow$ (1.2) $\Rightarrow$ (1.1) where $\mathcal{F}_1, \mathcal{F}_2$ are
defined in the paragraph describing Kemperman's example, and thus
(1.13) $\Leftrightarrow$ (1.1) $\Leftrightarrow$ (1.2) $\Leftrightarrow$ (1.3). It turns out that this use of surrogate
random variables with certain specified a.s. properties (cf. (1.13)), the
"surrogate approach," provides a necessary and sufficient condition for
expectation inequalities of the type (1.3) in cases where no other approach
seems to work (including the separation approach) and even in cases where no
useful necessary and sufficient condition for (1.3) of the type (1.1) can be
found. We illustrate the usage of this surrogate approach in a case treated in
Section 3.

Let $Z = (X,Y)$ and $Z' = (X',Y')$ be two-dimensional random vectors and $\mathcal{A}$ the
class of all principal lower and upper ideals in $\mathbb{R}^2$, i.e. all rectangles of
the forms $(-\infty,x] \times (-\infty,y]$ and $[x,\infty) \times [y,\infty)$. Then (1.1) is equivalent to saying
that $Z$ and $Z'$ have common marginal distributions and that (1.1) holds for all
principal lower ideals $(-\infty,x] \times (-\infty,y]$. It is shown in [2] and [12] that
(1.1) $\Leftrightarrow$ (1.2) where $\mathcal{F}_1$ is the class of all quasi-monotone functions $k$, i.e.
functions $k$ which satisfy the inequalities

$$k(x_1,y_1) + k(x_2,y_2) \ge k(x_1,y_2) + k(x_2,y_1), \quad x_1 \le x_2, \quad y_1 \le y_2, \tag{1.14}$$

for which the expectations in (1.2) are defined and which satisfy some minor
regularity conditions. It is not completely clear how the quasi-monotonicity
condition (1.14) should be modified in order to derive inequalities of type
(1.3). The separation approach would yield (1.5.iii) with $\mathcal{F}_2$ defined by (1.4)
as the class of all pairs of functions $\ell, k$ which are separated by a
quasi-monotone function. It then follows that

$$\ell(x_1,y_1) + \ell(x_2,y_2) \ge k(x_1,y_2) + k(x_2,y_1), \quad x_1 \le x_2, \quad y_1 \le y_2, \tag{1.15}$$

but we have been unable to find a direct description of the class $\mathcal{F}_2$ defined
through the separation approach. Condition (1.15) is necessary but not
sufficient for $\ell, k$ to be separated by a quasi-monotone function, as shown by an
example in Section 3. Another example shows that (1.1) does not imply (1.3)
for all functions $k$ and $\ell$ satisfying (1.15) (and for which the expectations in
(1.3) are defined). One could conceivably require $(\ell,k) \in \mathcal{F}_2$ to satisfy
additional inequalities which are in the same spirit as (1.15). For instance,
if the additional inequalities

$$\ell(x_1,y_1) + \ell(x_2,y_3) + \ell(x_3,y_2) \ge k(x_1,y_3) + k(x_2,y_2) + k(x_3,y_1), \quad x_1 \le x_2 \le x_3, \quad y_1 \le y_2 \le y_3,$$

do not hold, it is possible to construct examples where (1.1) holds but (1.3)
fails. But even these inequalities are insufficient and we have failed to
obtain usable conditions describing the class $\mathcal{F}_2$ by continuing with this
approach. An alternative approach is to assume that $k$ and $\ell$ satisfy no more
than (1.15), i.e. to define $\mathcal{F}_2$ as the class of all pairs of functions $k$ and $\ell$
which satisfy (1.15), and for which the expectations in (1.3) are defined, and
to impose additional assumptions on $Z$ and $Z'$, i.e. to strengthen condition
(1.1). This can be best achieved through a variation of the surrogate approach.
To this end consider the following condition:

$(C_0)$ There exists a four-dimensional random vector $(X_1,X_2,Y_1,Y_2)$ whose
values are in the set $\Gamma = \{(x_1,x_2,y_1,y_2): (x_1-x_2)(y_1-y_2) \ge 0\}$
and whose bivariate marginals $F_{11}, F_{12}, F_{21}, F_{22}$ for
$(X_1,Y_1), (X_1,Y_2), (X_2,Y_1), (X_2,Y_2)$ respectively satisfy
$F_{11} + F_{22} = 2H$ and $F_{12} + F_{21} = 2H'$, where $H$ and $H'$ are the
distribution functions of $Z$ and $Z'$ respectively.

When (1.15) holds, condition $(C_0)$ implies

$$\ell(X_1,Y_1) + \ell(X_2,Y_2) \ge k(X_1,Y_2) + k(X_2,Y_1) \quad \text{a.s.} \tag{1.16}$$

which, upon taking expectations (assuming they are defined), yields (1.3).
Hence $(C_0)$ $\Rightarrow$ (1.3). It is shown in Section 3 that when $Z$ and $Z'$ are
normally distributed with common means and variances then

$$\rho \ge \rho' \Leftrightarrow (C_0) \Leftrightarrow (1.1) \Leftrightarrow (1.2) \Leftrightarrow (1.3),$$

where $\rho$ ($\rho'$) denotes the correlation between the components of $Z$ ($Z'$). In
Section 3, a similar result is obtained when $Z$ and $Z'$ have elliptically
contoured distributions; and also a generalization from two to higher
dimensions is described.

For the example of the preceding paragraph, as was mentioned, (1.1) does
not imply (1.3) in general. It is shown in Section 4 that
$(C_0)$ $\Leftrightarrow$ (1.1)' $\Leftrightarrow$ (1.3), where $\mathcal{A}_2$ is the class of all pairs of Borel sets $A$
and $A'$ in the plane which are such that the functions $1_A$ and $1_{A'}$ satisfy (1.15),
and

$$P(Z \in A) \ge P(Z' \in A'), \quad (A,A') \in \mathcal{A}_2. \tag{1.1'}$$

Using a generalization of a theorem by Strassen [10] we obtain in Section 4
several further results of the type $(C_0)$ $\Leftrightarrow$ (1.1)' $\Leftrightarrow$ (1.3), where (1.1)' is
stronger than (1.1). Among these the following is related to the inequalities
of Section 2 described earlier. When $R = \mathbb{R}^2$, $\mathcal{A}$ is the class of all closed
symmetric rectangles; $\mathcal{F}_1$ is the class of all functions $k$ which satisfy

$$k(x_1,y_1) \le k(x_2,y_2), \quad |x_2| \le |x_1| \ \text{or}\ |y_2| \le |y_1|,$$

and for which the expectations in (1.2) are defined; and $\mathcal{F}_2$ is the class of all
pairs of functions $\ell$ and $k$ which satisfy

$$\ell(x_2,y_2) \ge k(x_1,y_1), \quad |x_2| \le |x_1| \ \text{or}\ |y_2| \le |y_1|,$$

and for which the expectations in (1.3) are defined; it is shown in Section 4
that (1.1) $\Leftrightarrow$ (1.2) $\Leftrightarrow$ (1.3). It should be noted that the functions
$k(x,y) = f(x^2+y^2)$, with $f$ nonincreasing and convex, considered in Section 2, do
not belong to the class $\mathcal{F}_1$. Finally Section 4 derives several new inequalities
of the type (1.1) and (1.3) for normal and elliptically contoured distributions.


2.  n-monotone  functions 

In  this  section  we  develop  inequalities  for  expectations  of  n-monotone 
functions  (to  be  defined  below)  of  the  squares  of  the  moduli  of  n-dimensional 
random  vectors.  We  begin  with  the  case  n  =  2  (Theorem  2.1)  and  then  proceed  to 
the general case $n \ge 2$ (Theorem 2.2). In the process of establishing
Theorem  2.2  we  develop  an  integral  representation  for  certain  n-monotone 
functions  (Lemma  2.3)  which  may  be  of  independent  interest. 

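Theorem 2.1. Suppose $Z = (X,Y)$ and $Z' = (X',Y')$ are bivariate random vectors
for which

$$P(|X| \le a, |Y| \le b) \ge P(|X'| \le a, |Y'| \le b), \quad a > 0, \quad b > 0. \tag{2.1}$$

Then

$$Ef(X^2+Y^2) \ge Ef(X'^2+Y'^2) \tag{2.2}$$

for each nonincreasing convex function $f$ on $[0,\infty)$.
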
Proof. Condition (2.1) is equivalent to saying that $aX'^2 \vee bY'^2$ is
stochastically larger than $aX^2 \vee bY^2$ for $a > 0$, $b > 0$, where $u \vee v$ denotes the
maximum of $u$ and $v$. Thus for any bounded nonincreasing function $h$ on $[0,\infty)$,

$$Eh(aX^2 \vee bY^2) \ge Eh(aX'^2 \vee bY'^2), \quad a > 0, \quad b > 0, \tag{2.3}$$

and, consequently,

$$E\int_0^{\pi/2} h\left(\frac{X^2}{\cos^2\theta} \vee \frac{Y^2}{\sin^2\theta}\right)\sin\theta\cos\theta\,d\theta \ \ge\ E\int_0^{\pi/2} h\left(\frac{X'^2}{\cos^2\theta} \vee \frac{Y'^2}{\sin^2\theta}\right)\sin\theta\cos\theta\,d\theta. \tag{2.4}$$

Now with the substitution of $(x^2+y^2)u$ for $\frac{x^2}{\cos^2\theta} \vee \frac{y^2}{\sin^2\theta}$, the integral

$$\int_0^{\pi/2} h\left(\frac{x^2}{\cos^2\theta} \vee \frac{y^2}{\sin^2\theta}\right)\sin\theta\cos\theta\,d\theta \quad \text{simplifies to} \quad \frac{1}{2}\int_1^\infty \frac{h((x^2+y^2)u)}{u^2}\,du.$$

Thus (2.2) holds for functions $f$ of the form $f(s) = \int_1^\infty \frac{h(su)}{u^2}\,du$, $s \ge 0$. But,
according to the lemma below, the class of such functions coincides with the
class of bounded nonincreasing convex functions. The unwanted restriction of
boundedness is easily removed by truncation: If $f$ is any nonincreasing convex
function, then $f \vee (-n)$ is a bounded nonincreasing convex function whose limit,
as $n \to \infty$, is $f$. Then (2.2) follows by means of the monotone convergence
theorem. □
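The change of variable used in the proof can be checked numerically. The sketch below is not part of the paper: it assumes the test function h(v) = exp(-v) (bounded and nonincreasing), uses a plain midpoint rule, and evaluates the right-hand side after the further substitution w = 1/u, which turns the integral over (1, ∞) into one over (0, 1). The function names `angular_integral` and `radial_integral` are hypothetical.

```python
import math


def angular_integral(h, x, y, n=20000):
    """Midpoint rule for int_0^{pi/2} h(x^2/cos^2 + y^2/sin^2 max) sin cos dtheta."""
    step = (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        th = (i + 0.5) * step
        c, s = math.cos(th), math.sin(th)
        total += h(max(x * x / (c * c), y * y / (s * s))) * s * c
    return total * step


def radial_integral(h, x, y, n=20000):
    """(1/2) int_1^inf h((x^2+y^2) u) / u^2 du, computed with w = 1/u on (0, 1)."""
    r2 = x * x + y * y
    step = 1.0 / n
    return 0.5 * step * sum(h(r2 / ((i + 0.5) * step)) for i in range(n))


h = lambda v: math.exp(-v)   # assumed test function, bounded and nonincreasing
```

For any fixed point (x, y) the two functions should agree up to quadrature error, for example `angular_integral(h, 0.6, 0.8)` against `radial_integral(h, 0.6, 0.8)`.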

Lemma 2.1. The class of bounded nonincreasing convex functions on $[0,\infty)$ and
the class of functions $f$ of the form $f(s) = \int_1^\infty \frac{h(su)}{u^2}\,du$, $s \ge 0$, with $h$
nonincreasing and bounded on $[0,\infty)$, coincide.

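Proof. Suppose $f(s) = \int_1^\infty \frac{h(su)}{u^2}\,du$, $s \ge 0$, with $h$ nonincreasing and bounded on
$[0,\infty)$. Quite obviously $f$ is nonincreasing and bounded. To see that $f$ is
convex, observe that $f(s) = s\int_s^\infty \frac{h(v)}{v^2}\,dv$ for $s > 0$ and $f(0) = h(0)$. Thus for
$0 < s < t$,

$$\begin{aligned}
f(s) + f(t) - 2f\left(\tfrac{s+t}{2}\right)
&= s\int_s^\infty \frac{h(v) - h\left(\frac{s+t}{2}\right)}{v^2}\,dv + t\int_t^\infty \frac{h(v) - h\left(\frac{s+t}{2}\right)}{v^2}\,dv - (s+t)\int_{\frac{s+t}{2}}^\infty \frac{h(v) - h\left(\frac{s+t}{2}\right)}{v^2}\,dv \\
&= s\int_s^{\frac{s+t}{2}} \frac{h(v) - h\left(\frac{s+t}{2}\right)}{v^2}\,dv + t\int_{\frac{s+t}{2}}^{t} \frac{h\left(\frac{s+t}{2}\right) - h(v)}{v^2}\,dv \ \ge\ 0.
\end{aligned}$$

The argument when $s = 0$ is similar. Consequently, $f$ is convex on $[0,\infty)$.
Conversely, suppose $f$ is a bounded nonincreasing convex function on $[0,\infty)$.
Define a nonincreasing function $h$ on $[0,\infty)$ as follows:

$$h(s) = \begin{cases} f(s) - s f^{\circ}(s) & \text{for } s > 0, \\ f(0) & \text{for } s = 0, \end{cases}$$

where $f^{\circ}(s)$ denotes, for definiteness, the smallest slope among all tangent
lines to $f$ at $s$. (Thus, when $f$ is differentiable at $s > 0$, $f^{\circ}(s) = f'(s)$.)
Then for fixed $n > 1$ and $s > 0$,

$$f(s) - s\,\frac{f\left(s\left(1+\frac{1}{n}\right)\right) - f(s)}{s/n} \ \le\ h(s) \ \le\ f(s) - s\,\frac{f(s) - f\left(s\left(1-\frac{1}{n}\right)\right)}{s/n}$$

due to the convexity of $f$. Thus

$$(n+1)f(s) - nf\left(s\left(1+\tfrac{1}{n}\right)\right) \ \le\ h(s) \ \le\ nf\left(s\left(1-\tfrac{1}{n}\right)\right) - (n-1)f(s).$$

But

$$\int_1^\infty \frac{(n+1)f(su) - nf\left(su\left(1+\frac{1}{n}\right)\right)}{u^2}\,du = (n+1)\int_1^{1+\frac{1}{n}} \frac{f(su)}{u^2}\,du \to f(s) \quad \text{as } n \to \infty,$$

since $f$ is continuous at every point $s > 0$, and likewise

$$\int_1^\infty \frac{nf\left(su\left(1-\frac{1}{n}\right)\right) - (n-1)f(su)}{u^2}\,du \to f(s) \quad \text{as } n \to \infty,$$

from which it follows that

$$f(s) = \int_1^\infty \frac{h(su)}{u^2}\,du, \quad s \ge 0.$$

From this integral it is easily checked that $h$ is bounded. □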
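Both directions of Lemma 2.1 can be illustrated with a short numerical sketch. It is an illustration only, under assumptions not in the paper: the integral is evaluated by a midpoint rule after the substitution w = 1/u, and the two functions h are hypothetical test cases. The step function h = 1 on [0,1] should produce f(s) = max(1 - s, 0), and h = f - s f' for f(s) = exp(-s), namely h(s) = (1 + s) exp(-s), should reproduce exp(-s).

```python
import math


def integral_transform(h, s, n=100000):
    """f(s) = int_1^inf h(s*u)/u^2 du, via w = 1/u: f(s) = int_0^1 h(s/w) dw."""
    step = 1.0 / n
    return step * sum(h(s / ((i + 0.5) * step)) for i in range(n))


# Direction 1: a bounded nonincreasing h gives a nonincreasing convex f.
step_h = lambda v: 1.0 if v <= 1.0 else 0.0

# Direction 2: h = f - s*f' recovers f; for f(s) = exp(-s) this is (1 + s) exp(-s).
smooth_h = lambda v: (1.0 + v) * math.exp(-v)
```

For instance, `integral_transform(step_h, 0.3)` should be close to 0.7, and `integral_transform(smooth_h, 1.5)` close to `math.exp(-1.5)`.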

There are unbounded nonincreasing convex functions $f$ which cannot be
expressed in the above integral form, with $h$ nonincreasing and necessarily
unbounded, e.g., $f(s) = -s$, $s \ge 0$. An analogue of Lemma 2.1 can be established
for nonincreasing convex functions $f$ defined on the open interval $(0,\infty)$.
Boundedness is not essential on $(0,1]$, but is on $[1,\infty)$. See Lemma 2.3 below.

Likewise Theorem 2.1 can be modified to cover functions $f$ defined on $(0,\infty)$
which are nonincreasing and convex. Such functions can be approximated from
below by functions of the type described in Theorem 2.1; and through use of the
monotone convergence theorem, we can obtain:

Corollary 2.1. If, in addition to (2.1), $P(Z=(0,0)) = 0$, then ($P(Z'=(0,0)) = 0$
and) (2.2) holds for each nonincreasing convex function $f$ on $(0,\infty)$ for which
the expectations contained therein exist.
It is apparent from the nature of assumption (2.1), appearing in Theorem
2.1, that inequality (2.2) can be extended to

$$Ef(\alpha X^2 + \beta Y^2) \ge Ef(\alpha X'^2 + \beta Y'^2), \quad \alpha \ge 0, \quad \beta \ge 0.$$

There are, of course, many nonincreasing convex functions $f$ to which
Theorem 2.1 or its corollary is applicable. As an example the assumptions of
Theorem 2.1 imply $ER^\alpha \le ER'^\alpha$, $0 < \alpha \le 2$, where $R^2 = X^2 + Y^2$ and $R'^2 = X'^2 + Y'^2$,
while the assumptions of Corollary 2.1 permit the conclusion $ER^\alpha \ge ER'^\alpha$, $\alpha < 0$.

The value of Theorem 2.1 and its corollary depends, of course, upon the
reasonableness of assumption (2.1), an inequality of type (1.1). Theorem 2.1
of Das Gupta et al. [3] states easily checked conditions under which this
inequality holds for pairs of related elliptically contoured distributions,
as well as conditions under which assumption (2.7) holds in Theorem 2.2 below
and in its corollary.

The requirements in Theorem 2.1 that $f$ be nonincreasing and convex are
both necessary for the generality of the theorem: If $f$ is any function on
$[0,\infty)$ which satisfies (2.2) whenever (2.1) holds and the expectations make
sense, then $f$ must be nonincreasing and convex.

Proof. The need for $f$ to be nonincreasing can be seen by considering
nonstochastic $Z$ and $Z'$ of the form $(x,0)$ and $(x',0)$, $0 \le x < x'$. Now
suppose $f$ is nonincreasing and satisfies (2.2) for all $Z = (X,Y)$ and
$Z' = (X',Y')$ satisfying (2.1). For $s > 0$ and $\rho \in (0,1]$, let $Z' = s^{1/2}V$
and $Z = s^{1/2}V\begin{pmatrix}1 & \rho \\ \rho & 1\end{pmatrix}^{1/2}$, where $V$ is uniformly distributed on the unit circle. Since
$Z$ and $Z'$ are elliptically contoured vectors which satisfy (2.1) (cf. Theorem 2.1
of [3]), inequality (2.2) holds, which translates into

$$f(s) \le \pi^{-1}\int_{-1}^{1} (1-u^2)^{-1/2} f(s[1+u\rho])\,du, \quad s > 0, \quad \rho \in (0,1]. \tag{2.5}$$

Replacing $s$ by $s - 1/n$ and letting $n \to \infty$ yields

$$f(s-) \le \pi^{-1}\int_{-1}^{1} (1-u^2)^{-1/2} f(s[1+u\rho])\,du, \quad s > 0, \quad \rho \in (0,1],$$

which, in turn, due to the monotonicity of $f$, yields

$$f(s-) \le \pi^{-1}\int_{-1}^{0} (1-u^2)^{-1/2} f(s-s\rho)\,du + \pi^{-1}\int_{0}^{1} (1-u^2)^{-1/2} f(s+)\,du = \tfrac{1}{2} f(s-s\rho) + \tfrac{1}{2} f(s+)$$

for $s > 0$ and $\rho \in (0,1]$. Letting $\rho \downarrow 0$, we obtain $f(s+) \ge f(s-)$, $s > 0$, which
establishes the continuity of $f$ on $(0,\infty)$.

Now suppose $f$ is not convex so that for some $0 < a < b$, we have $f(a) > f(b)$
and

$$f(a) + f(b) < 2f\left(\frac{a+b}{2}\right). \tag{2.6}$$

Consider lines $t = ms + c$, $a \le s \le b$, of negative slope $m = (f(b)-f(a))/(b-a)$.
For large values of $c$, the line $t = ms + c > f(s)$ over the entire interval
$[a,b]$. Let $c$ decrease until the line first touches the graph of $f$ at some
point in the interval $[a,b]$, and let $s_0$ be the smallest such point of contact
with this line. (Since $f$ is continuous on $(0,\infty)$, both $c$ and $s_0$ are well-defined.)
Due to (2.6), $s_0$ is in the open interval $(a,b)$. Setting $s = s_0$ and
$\rho = \left(1 - \frac{a}{s_0}\right) \wedge \left(\frac{b}{s_0} - 1\right)$, so that $0 < \rho < 1$ and $a \le s(1+u\rho) \le b$ for $-1 \le u \le 1$, we
obtain from inequality (2.5):

$$f(s_0) \le \pi^{-1}\int_{-1}^{1} (1-u^2)^{-1/2} f(s_0[1+u\rho])\,du \le \pi^{-1}\int_{-1}^{1} (1-u^2)^{-1/2} (ms_0[1+u\rho]+c)\,du = ms_0 + c = f(s_0).$$

This can only happen if

$$f(s_0[1+u\rho]) = ms_0(1+u\rho) + c, \quad -1 \le u \le 1,$$

which is impossible (for negative $u$) due to the way $s_0$ is defined. Thus $f$ must
be convex. □

We remark that the random variables $R^2$ and $R'^2$, associated with the random
vectors $Z = s^{1/2}V\begin{pmatrix}1 & \rho \\ \rho & 1\end{pmatrix}^{1/2}$ and $Z' = s^{1/2}V$ (used in this proof), are not stochastically
ordered since $ER^2 = ER'^2 = s$. Thus condition (2.1) can hold without $R^2$ being
stochastically smaller than $R'^2$. It follows, of course, that condition (2.1)
can hold without (2.2) holding for every nonincreasing function $f$.

Finally it should be pointed out that the argument used in establishing
Theorem 2.1 shows that inequality (1.2) holds for all functions $k$ of the form

$$k(x,y) = \int_0^{\pi/2} F_\theta\left(\frac{x^2}{\cos^2\theta} \vee \frac{y^2}{\sin^2\theta}\right) d\mu(\theta),$$

where $F_\theta(r)$ is jointly measurable in $(\theta,r)$ and nonincreasing in $r$ for each
fixed $\theta$, $\mu$ is a measure on the open interval $(0,\frac{\pi}{2})$, and the indicated integral
exists and is finite. By choosing for instance $F_\theta(r) = h(r)g(\theta)$ with $h$ bounded
and nonincreasing and $g$ bounded and $\ge 0$ (e.g. $g(\theta) = (\sin\theta)^n (\cos\theta)^m$) and $\mu$
Lebesgue measure, we can generate a large class of symmetric as well as
nonsymmetric functions $k(x,y)$. The choice $g(\theta) = \sin\theta\cos\theta$ gives Theorem 2.1
for bounded nonincreasing convex functions of $x^2+y^2$; $k(x,y) = f(x^2+y^2)$.

Theorem 2.1 can be generalized to higher dimensional vectors, and this is
done in Theorem 2.2 where the following terminology is used. For $2 \le n < \infty$, a
function $f$ defined on $[0,\infty)$ or $(0,\infty)$ is said to be n-monotone if its $k$-th order
divided differences are of alternating signs for $1 \le k \le n$, of nonpositive sign
for odd $k$ and of nonnegative sign for even $k$. (Thus $[x_0,x_1;f]$, defined by
$(f(x_0)-f(x_1))/(x_0-x_1)$, is nonpositive for distinct $x_0$ and $x_1$ in the domain of
$f$; $[x_0,x_1,x_2;f]$, defined by $([x_0,x_1;f] - [x_1,x_2;f])/(x_0-x_2)$, is nonnegative for
distinct $x_0$, $x_1$ and $x_2$; etc.) It follows from Theorem A, page 238, of Roberts
and Varberg [7] that $f$ is n-monotone iff (i) it is nonincreasing on its domain,
(ii) it is $(n-2)$-times continuously differentiable on $(0,\infty)$ with

$$(-1)^k f^{(k)}(s) \ge 0, \quad s > 0, \quad k = 1, \ldots, n-2,$$

and (iii) $(-1)^{n-2}f^{(n-2)}$ is nonincreasing and convex on $(0,\infty)$. For future
reference, we note that (iii) is equivalent to: (iii') $(-1)^{n-2}f^{(n-2)}$ is
locally absolutely continuous with a nonpositive and nondecreasing
(Radon-Nikodym) derivative $(-1)^{n-2}f^{(n-1)}$. A function $f$ defined on $[0,\infty)$ or
$(0,\infty)$ is said to be $\infty$-monotone if it is n-monotone for all $n$, i.e., if $f$ is
nonincreasing on its domain, and $f$ is infinitely differentiable on $(0,\infty)$ with
$(-1)^k f^{(k)}(s) \ge 0$, $s > 0$, $k \ge 1$. In Lemma 2.3 an integral representation is
obtained for all bounded n-monotone functions defined on $[0,\infty)$, $2 \le n < \infty$. A
well-known related notion is that of complete monotonicity. A function $f$
defined on $[0,\infty)$ or $(0,\infty)$ is called completely monotone if it is continuous
on its domain, and it is infinitely differentiable on $(0,\infty)$ with
$(-1)^k f^{(k)}(s) \ge 0$, $s > 0$, $k \ge 0$. Thus a completely monotone function is
$\infty$-monotone, and if $f$ is $\infty$-monotone on $[0,\infty)$ or $(0,\infty)$, then $-f'$ is completely
monotone on $(0,\infty)$. Completely monotone functions on $[0,\infty)$ are Laplace
transforms of finite measures on $[0,\infty)$, and completely monotone functions on
$(0,\infty)$ are Laplace transforms of (not necessarily finite) measures on $[0,\infty)$
for which the Laplace transform is finite on $(0,\infty)$. (See Widder [13].)


Theorem 2.2. If the random vectors $Z = (Z_1,\ldots,Z_n)$ and $Z' = (Z_1',\ldots,Z_n')$, $n \ge 2$,
satisfy

$$P(|Z_1| \le a_1, \ldots, |Z_n| \le a_n) \ge P(|Z_1'| \le a_1, \ldots, |Z_n'| \le a_n), \quad a_1 > 0, \ldots, a_n > 0, \tag{2.7}$$

and if $f$ is an n-monotone function on $[0,\infty)$, then

$$Ef\left(\sum_{i=1}^n Z_i^2\right) \ge Ef\left(\sum_{i=1}^n Z_i'^2\right). \tag{2.8}$$


Remarks. 1. Theorem 2.2 can be viewed as an extension to higher dimensions of
a well-known result in one dimension, provided one interprets a 1-monotone
function as a nonincreasing function.

2. Some examples of $\infty$-monotone functions on $[0,\infty)$, to which
Theorem 2.2 is applicable are: $e^{-\alpha s}$ ($\alpha > 0$), $-s^\alpha$ ($0 < \alpha \le 1$),
$(s+a)^\alpha$ ($\alpha < 0$, $a > 0$), $-\log(s+a)$ ($a > 0$).

3. An example of an n-monotone function which is not (n+1)-monotone
is the function $f$ defined by $f(s) = ((1-s) \vee 0)^{n-1}$, $s \ge 0$ ($n \ge 2$).
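The divided-difference definition of n-monotonicity can be tried out on the example of Remark 3. The sketch below is illustrative only: it checks the sign pattern on runs of consecutive points from one hand-picked grid (an assumption, so a `True` result is evidence on that grid and not a proof), and the helper names are hypothetical.

```python
import math


def divided_difference(f, xs):
    """[x_0, ..., x_k; f], defined recursively for distinct points xs."""
    if len(xs) == 1:
        return f(xs[0])
    left = divided_difference(f, xs[:-1])
    right = divided_difference(f, xs[1:])
    return (left - right) / (xs[0] - xs[-1])


def has_alternating_signs(f, xs, n, tol=1e-12):
    """Check orders k = 1..n on every run of k+1 consecutive points of xs.

    Odd orders must be nonpositive and even orders nonnegative (up to tol).
    """
    for k in range(1, n + 1):
        for i in range(len(xs) - k):
            d = divided_difference(f, xs[i:i + k + 1])
            if (-1) ** k * d < -tol:
                return False
    return True


truncated_square = lambda s: max(1.0 - s, 0.0) ** 2   # Remark 3 with n = 3
points = [0.0, 0.3, 0.6, 0.9, 1.2]                     # assumed test grid
```

On this grid the function ((1-s) ∨ 0)^2 passes the check for orders up to 3 and fails at order 4, in line with Remark 3, while exp(-s) passes at every order tested.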

Before  we  prove  Theorem  2.2,  we  must  gather  together  a  number  of  facts, 
some  of  which  we  state  in  the  form  of  lemmas. 


Lemma 2.2. If $h$ is a bounded function and the random vector $(V_1,\ldots,V_n)$, $n \ge 2$,
is uniformly distributed on the surface of the n-dimensional unit sphere, then

$$E\,h\left(\bigvee_{i=1}^n \frac{x_i^2}{V_i^2}\right)\prod_{i=1}^n |V_i| = \frac{\Gamma\left(\frac{n}{2}\right)}{\pi^{n/2}(n-2)!}\int_1^\infty \frac{(u-1)^{n-2}}{u^n}\,h(r^2u)\,du, \tag{2.9}$$

for every real vector $(x_1,\ldots,x_n)$, where $r^2 = x_1^2 + \cdots + x_n^2$.


Proof. For $n = 2$, (2.9) is established in the proof of Theorem 2.1. We now
assume (2.9) is true when $n$ in (2.9) is replaced by $n-1$, and proceed to
establish its validity for $n$ ($n \ge 3$). Using the facts that $V_n$ has density

$$f_{V_n}(v) = \frac{\Gamma\left(\frac{n}{2}\right)}{\sqrt{\pi}\,\Gamma\left(\frac{n-1}{2}\right)}\,(1-v^2)^{\frac{n-3}{2}}, \quad -1 < v < 1,$$

that, conditioned on $V_n = v$, $(1-v^2)^{-1/2}(V_1,\ldots,V_{n-1})$ is uniformly distributed on
the $(n-1)$-dimensional unit sphere (see for instance Lemma 3 in [1]), and
therefore that (by the induction hypothesis) the conditional expectation of
$h\left(\bigvee_1^n \frac{x_i^2}{V_i^2}\right)\prod_1^n |V_i|$ given $V_n = v$ is

$$\frac{\Gamma\left(\frac{n-1}{2}\right)}{\pi^{\frac{n-1}{2}}(n-3)!}\,|v|(1-v^2)^{\frac{n-1}{2}}\int_1^\infty \frac{(u-1)^{n-3}}{u^{n-1}}\,h\left(\frac{(r^2-x_n^2)u}{1-v^2} \vee \frac{x_n^2}{v^2}\right)du,$$

we obtain (after some minor simplifications)

$$\begin{aligned}
E\,h\left(\bigvee_1^n \frac{x_i^2}{V_i^2}\right)\prod_1^n |V_i|
&= \int_{-1}^{1} E\left[h\left(\bigvee_1^n \frac{x_i^2}{V_i^2}\right)\prod_1^n |V_i| \,\Big|\, V_n = v\right] f_{V_n}(v)\,dv \\
&= \frac{2\,\Gamma\left(\frac{n}{2}\right)}{\pi^{n/2}(n-3)!}\int_1^\infty \frac{(u-1)^{n-3}}{u^{n-1}}\int_0^1 h\left(\frac{(r^2-x_n^2)u}{1-v^2} \vee \frac{x_n^2}{v^2}\right) v(1-v^2)^{n-2}\,dv\,du.
\end{aligned} \tag{2.10}$$

With the change of variable $v \to y$: $\frac{u(r^2-x_n^2)}{1-v^2} \vee \frac{x_n^2}{v^2} = r^2y$, the inner integral in
(2.10) simplifies to

$$\frac{1}{2r^{2(n-1)}}\int_{\frac{u(r^2-x_n^2)+x_n^2}{r^2}}^{\infty} \frac{h(r^2y)}{y^n}\left[x_n^2(r^2y-x_n^2)^{n-2} + u^{n-1}(r^2-x_n^2)^{n-1}\right]dy,$$

and (2.10) becomes

$$\frac{\Gamma\left(\frac{n}{2}\right)}{\pi^{n/2}(n-3)!\,r^{2(n-1)}}\int_1^\infty \frac{h(r^2y)}{y^n}\int_1^{\frac{r^2y-x_n^2}{r^2-x_n^2}} (u-1)^{n-3}\left[\frac{x_n^2(r^2y-x_n^2)^{n-2}}{u^{n-1}} + (r^2-x_n^2)^{n-1}\right]du\,dy.$$

The inner integral equals

$$x_n^2(r^2y-x_n^2)^{n-2}\,\frac{1}{n-2}\left[1 - \frac{r^2-x_n^2}{r^2y-x_n^2}\right]^{n-2} + (r^2-x_n^2)^{n-1}\,\frac{1}{n-2}\left[\frac{r^2y-x_n^2}{r^2-x_n^2} - 1\right]^{n-2} = \frac{r^{2(n-1)}(y-1)^{n-2}}{n-2},$$

and thus (2.10) becomes

$$\frac{\Gamma\left(\frac{n}{2}\right)}{\pi^{n/2}(n-2)!}\int_1^\infty \frac{(y-1)^{n-2}}{y^n}\,h(r^2y)\,dy. \qquad \Box$$


By using the same argument used in proving Theorem 2.1, together with
Lemma 2.2, one can readily establish (2.8) for functions $f$ defined on $[0,\infty)$ of
the form

$$f(s) = \int_1^\infty \frac{(u-1)^{n-2}}{u^n}\,h(us)\,du, \quad s \ge 0, \tag{2.11}$$

where $h$ is bounded and nonincreasing. The class of such functions is
characterized in Lemma 2.3, which follows.
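The integral form (2.11), read here as f(s) = ∫_1^∞ (u-1)^{n-2} u^{-n} h(us) du, can be evaluated numerically to see which functions it produces. The sketch below is an illustration under stated assumptions: the weight is taken as in that reading, the substitution w = 1 - 1/u (which turns the weight into w^{n-2} dw on (0,1)) is a convenience of the sketch, and the indicator h is a hypothetical test case. With that h the integral has the closed form ((1-s) ∨ 0)^{n-1}/(n-1), a multiple of the function in Remark 3.

```python
def n_monotone_transform(h, s, n, steps=200000):
    """f(s) = int_1^inf (u-1)^(n-2) / u^n * h(u*s) du for n >= 2.

    With w = 1 - 1/u the weight (u-1)^(n-2)/u^n du becomes w^(n-2) dw on (0, 1).
    A midpoint rule is used, so w never reaches the endpoint 1.
    """
    dw = 1.0 / steps
    total = 0.0
    for i in range(steps):
        w = (i + 0.5) * dw
        total += w ** (n - 2) * h(s / (1.0 - w))
    return total * dw


indicator = lambda v: 1.0 if v <= 1.0 else 0.0   # bounded and nonincreasing
```

For example, `n_monotone_transform(indicator, 0.4, 3)` should be close to 0.6**2 / 2, and the case n = 2 reduces to the integral of Lemma 2.1.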

Lemma 2.3. The class $\mathcal{F}_n$ of functions $f$ described by (2.11), with $h$ any bounded
and nonincreasing function on $[0,\infty)$, coincides with the class of bounded
n-monotone functions on $[0,\infty)$. The class $\mathcal{F}_n'$ of functions $f$ of the form

$$f(s) = s\int_s^\infty \frac{(v-s)^{n-2}}{v^n}\,h(v)\,dv, \quad s > 0, \tag{2.12}$$

with $h$ any nonincreasing function on $(0,\infty)$ and bounded on $[1,\infty)$, coincides with
the class of n-monotone functions on $(0,\infty)$ which are bounded on $[1,\infty)$ ($n \ge 2$).

Remark. An immediate consequence of this lemma is that the classes $\mathcal{F}_n$ and $\mathcal{F}_n'$
are nonincreasing in $n$. If $f \in \mathcal{F}_n$ ($\in \mathcal{F}_n'$) for every $n$, then $f - f(\infty)$ is a
completely monotone function on $[0,\infty)$ (on $(0,\infty)$), and, therefore, $f$ must be of
the form

$$f(s) = c + \int_0^\infty e^{-su}\,d\mu(u), \quad s \ge 0 \ (s > 0),$$

where $c$ is any real number and $\mu$ is a finite measure on $[0,\infty)$ ($\mu$ is a measure
on $(0,\infty)$ for which the integral is finite for all $s > 0$).


Proof. We shall prove the characterization of $F'_n$. Since the right-hand side
of (2.12) is equal to the integral in (2.11) when s > 0, the stated
characterization of $F_n$ is easily inferred from that for $F'_n$.

Suppose $f \in F'_n$. It is clear from (2.11) that f is bounded on $[1,\infty)$, and
from (2.12) that f is (n-2)-times continuously differentiable on $(0,\infty)$ with


$$f^{(k)}(s) = (-1)^{k-1}\, \frac{(n-2)!}{(n-k-1)!} \int_s^\infty \frac{(v-s)^{n-k-2}}{v^n}\, [kv - (n-1)s]\, h(v)\, dv , \quad s > 0 , \ 1 \le k \le n-2 . \tag{2.13}$$


Also $f^{(n-2)}$ is locally absolutely continuous with a (Radon-Nikodym) derivative

$$f^{(n-1)}(s) = (-1)^{n-2} (n-1)! \int_s^\infty \frac{h(v)}{v^n}\, dv + (-1)^{n-1} (n-2)!\, s^{-(n-1)} h(s) \quad \text{a.e. on } (0,\infty) ,$$

which, after integrating by parts, simplifies to the version we will use:

$$f^{(n-1)}(s) = (-1)^n (n-2)! \int_s^\infty v^{-(n-1)}\, dh(v) , \quad s > 0 . \tag{2.14}$$

Clearly $f^{(n-1)}(\infty)$ ($= \lim_{s\to\infty} f^{(n-1)}(s)$) exists and equals zero. In fact,

$$s^{k-1} f^{(k)}(s) \to 0 \ \text{ as } s \to \infty , \quad 1 \le k \le n-1 . \tag{2.15}$$

This is obvious from (2.14) for k = n - 1 and from (2.13) for $1 \le k \le n-2$.
Observe that $(-1)^{n-1} f^{(n-1)}$ is nonincreasing. Since h is nonincreasing, it
follows by (2.14) that $(-1)^{n-1} f^{(n-1)}$ is nonnegative. Proceeding by backwards
induction: From the nonnegativity of $(-1)^{n-1} f^{(n-1)}$, we infer that $(-1)^{n-2} f^{(n-2)}$
is nonincreasing. Since $f^{(n-2)}(\infty) = 0$ (implied by (2.15)), $(-1)^{n-2} f^{(n-2)}$ is
nonnegative, etc. Thus $(-1)^j f^{(j)}$ is nonnegative for j = 1,...,n-1, and
$(-1)^{n-1} f^{(n-1)}$ is nonincreasing, which together say that f is n-monotone.

Conversely, suppose f is n-monotone on $(0,\infty)$ and bounded on $[1,\infty)$. Define
h on $(0,\infty)$ by

$$h(v) = (n-1) \sum_{k=0}^{n-1} \frac{(-1)^k}{k!}\, v^k f^{(k)}(v) , \quad v > 0 , \tag{2.16}$$

which is nonincreasing, as is seen from (2.17), because $(-1)^{n-1} f^{(n-1)}$ is
nonincreasing (a consequence of f being n-monotone). Observe that h is locally
of bounded variation and


$$dh(v) = \frac{(-1)^{n-1}}{(n-2)!}\, v^{n-1}\, df^{(n-1)}(v) , \tag{2.17}$$

i.e.,

$$h(v) - h(s) = \frac{(-1)^{n-1}}{(n-2)!} \int_s^v u^{n-1}\, df^{(n-1)}(u) , \quad 0 < s < v < \infty . \tag{2.18}$$


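Then for s > 0 (using (2.18)),

$$\begin{aligned}
s \int_s^\infty \frac{(v-s)^{n-2}}{v^n}\, (h(v)-h(s))\, dv
&= \frac{(-1)^{n-1}}{(n-2)!}\, s \int_s^\infty \frac{(v-s)^{n-2}}{v^n} \int_s^v u^{n-1}\, df^{(n-1)}(u)\, dv \\
&= \frac{(-1)^{n-1}}{(n-1)!} \int_s^\infty \{u^{n-1} - (u-s)^{n-1}\}\, df^{(n-1)}(u) = F_{n-1}(s) ,
\end{aligned}$$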


say, where we have applied Fubini's theorem for nonnegative functions (without
knowing, as yet, that $F_{n-1}(s)$ is finite). In what follows, we shall need to
use (2.15), which should be justified in the present context. This is done in
Lemma 2.4 below for the kth derivative of an arbitrary n-monotone function,
$2 \le k \le n-1$. The remainder of (2.15), for k = 1, is valid in the present
context since, by assumption, f is bounded on $[1,\infty)$.

Now, using integration by parts and (2.15) for k = n - 1, we obtain

$$F_{n-1}(s) = -\frac{(-1)^{n-1}}{(n-1)!}\, s^{n-1} f^{(n-1)}(s) + F_{n-2}(s) ,$$

where

$$F_{n-2}(s) = \frac{(-1)^{n-2}}{(n-2)!} \int_s^\infty \{u^{n-2} - (u-s)^{n-2}\}\, f^{(n-1)}(u)\, du .$$

Proceeding by backwards induction, we are eventually led to

$$s \int_s^\infty \frac{(v-s)^{n-2}}{v^n}\, (h(v)-h(s))\, dv = f(s) - \sum_{k=0}^{n-1} \frac{(-1)^k}{k!}\, s^k f^{(k)}(s) ,$$

which, in view of (2.16), establishes (2.12). The boundedness of h on $[1,\infty)$
may be inferred from (2.12).
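
The correspondence between (2.16) and (2.12) can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes f(s) = exp(-s) (which is n-monotone for every n), builds h from (2.16), and recovers f from (2.12) after the substitution t = s/v, which turns (2.12) into the integral of (1-t)^(n-2) h(s/t) over (0,1). The function names and the midpoint-rule quadrature are choices made here, not part of the paper.

```python
import math

def h_from_f(v, n):
    # (2.16) for the illustrative choice f(s) = exp(-s),
    # for which (-1)^k f^(k)(v) = exp(-v) for every k.
    return (n - 1) * sum(v**k * math.exp(-v) / math.factorial(k) for k in range(n))

def f_from_h(s, n, steps=20000):
    # (2.12) after the substitution t = s/v:
    #   f(s) = integral over 0 < t < 1 of (1 - t)^(n-2) * h(s/t) dt,
    # approximated with the midpoint rule (midpoints avoid t = 0).
    dt = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        total += (1.0 - t) ** (n - 2) * h_from_f(s / t, n)
    return total * dt

def max_roundtrip_error(n, points=(0.5, 1.0, 2.0, 5.0)):
    # Largest deviation between the recovered f and exp(-s) on a few test points.
    return max(abs(f_from_h(s, n) - math.exp(-s)) for s in points)
```

For instance, `max_roundtrip_error(3)` should be small, reflecting that exp(-s) is reproduced by its own h.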


If f is n-monotone on $[0,\infty)$ or $(0,\infty)$, then $(-1)^k f^{(k)}$ is nonnegative and
nonincreasing on $(0,\infty)$ for k = 1,...,n-1, and, hence,

$$0 \le s(-1)^k f^{(k)}(s) \le 2 \int_{s/2}^{s} (-1)^k f^{(k)}(u)\, du = 2\{(-1)^k f^{(k-1)}(s) - (-1)^k f^{(k-1)}(s/2)\} , \quad s > 0 , \ 1 \le k \le n-1 . \tag{2.19}$$

These inequalities permit us to describe the behavior of the derivatives of f
as $s \to \infty$ and $s \downarrow 0$:

Lemma 2.4. If f is n-monotone on $[0,\infty)$ or $(0,\infty)$ for some $n \ge 3$, then

$$s^{k-1} f^{(k)}(s) \to 0 \ \text{ as } s \to \infty , \quad 2 \le k \le n-1 . \tag{2.20}$$

Proof. This follows by induction from (2.19), provided (corresponding to
k - 1 = 1) $f^{(1)}(s) - f^{(1)}(s/2) \to 0$. But this is the case since $f^{(1)}$ is
nonpositive and nondecreasing.


Lemma 2.5. If f is n-monotone on $[0,\infty)$, or n-monotone on $(0,\infty)$ with f(0+)
finite, for some $n \ge 2$, then

$$s^k f^{(k)}(s) \to 0 \ \text{ as } s \downarrow 0 , \quad 1 \le k \le n-1 . \tag{2.21}$$

Proof. In either case, f(0+) exists and is finite. Thus $f(s) - f(s/2) \to 0$ as
$s \downarrow 0$, and (2.21) follows from (2.19) by induction.
Proof of Theorem 2.2. In view of the remark preceding Lemma 2.3, we may take
(2.8) to be established for all bounded n-monotone functions f on $[0,\infty)$. The
proof for unbounded f requires the removal from f of its (possible) linear part
and then a truncation argument.

Suppose first that f(s) = -cs, $s \ge 0$, where c > 0. Inequality (2.8) can
be expressed as

$$\sum_{i=1}^{n} E Z_i^2 \le \sum_{i=1}^{n} E Z_i'^2 ,$$

which must hold since (2.7) implies $Z_i^2$ is stochastically smaller than $Z_i'^2$,
i = 1,...,n.

Now suppose f is any unbounded n-monotone function on $[0,\infty)$, i.e. $f(\infty) = -\infty$.
Since $f^{(1)}$ is nonpositive and nondecreasing, the finite nonpositive limit
$f^{(1)}(\infty)$ exists. We shall assume, without loss of generality, that $f^{(1)}(\infty) = 0$.
For otherwise, we may express f as the sum of two n-monotone functions,
$f = f_1 + f_2$ where $f_1^{(1)}(\infty) = 0$ and $f_2(s) = f^{(1)}(\infty) \cdot s$, $s \ge 0$, and treat the
parts independently. We shall truncate f as follows: Define h on $(0,\infty)$ by
(2.16) and h(0) = (n-1)f(0). For x > 0, let $h_x(v) = h(v \wedge x)$, $v \ge 0$, and define
$f_x$ by (see (2.11))

$$f_x(s) = \int_1^\infty \frac{(u-1)^{n-2}}{u^n}\, h_x(us)\, du , \quad s \ge 0 . \tag{2.22}$$

Since h is nonincreasing on $(0,\infty)$ (see (2.17)) and
$h(0+) = (n-1)f(0+) \le (n-1)f(0) = h(0)$ (cf. Lemma 2.5), it follows that h is
nonincreasing on $[0,\infty)$ and, consequently, $h_x$ is a bounded nonincreasing
function on $[0,\infty)$ for every x > 0. This implies that (2.8) holds for each
function $f_x$, and it only remains to show, if possible, that $f_x \downarrow f$ as $x \to \infty$ (so
that (2.8) follows for f itself by the monotone convergence theorem). Since $h_x$
is nonincreasing in x, so is $f_x$ (apparent from (2.22)), and thus it is only
necessary to show the pointwise convergence of $f_x$ to f.

From (2.22) we have $f_x(0) = (n-1)^{-1} h(0) = f(0)$ for all x > 0. Thus we may
focus our attention exclusively on points s > 0. For such points, it is more
convenient to use the following variant of (2.22) (see (2.12)):

$$f_x(s) = s \int_s^\infty \frac{(v-s)^{n-2}}{v^n}\, h_x(v)\, dv , \quad s > 0 .$$

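For x > s > 0, we have (using (2.18))

$$\begin{aligned}
f_x(s) &= s \int_s^x \frac{(v-s)^{n-2}}{v^n}\, h(v)\, dv + s \int_x^\infty \frac{(v-s)^{n-2}}{v^n}\, dv \cdot h(x) \\
&= s \int_s^x \frac{(v-s)^{n-2}}{v^n} \left\{ h(s) + \frac{(-1)^{n-1}}{(n-2)!} \int_s^v u^{n-1}\, df^{(n-1)}(u) \right\} dv + \frac{h(x)}{n-1} \left\{ 1 - \left(1-\frac{s}{x}\right)^{n-1} \right\} \\
&= \frac{(-1)^{n-1}}{(n-1)!} \int_s^x u^{n-1} \left\{ \left(1-\frac{s}{x}\right)^{n-1} - \left(1-\frac{s}{u}\right)^{n-1} \right\} df^{(n-1)}(u) + \frac{h(s)}{n-1} \left(1-\frac{s}{x}\right)^{n-1} + \frac{h(x)}{n-1} \left\{ 1 - \left(1-\frac{s}{x}\right)^{n-1} \right\} \\
&= -\frac{(-1)^{n-1}}{(n-1)!} \int_s^x (u-s)^{n-1}\, df^{(n-1)}(u) + \frac{h(x)}{n-1} .
\end{aligned}$$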


By repeatedly integrating by parts (much as in the proof of Lemma 2.3), we
obtain

$$f_x(s) = f(s) + \sum_{k=1}^{n-1} \frac{(-1)^k}{k!}\, f^{(k)}(x)\, \{x^k - (x-s)^k\} , \quad x > s > 0 .$$

Since, as $x \to \infty$, $f^{(1)}(x) \to f^{(1)}(\infty) = 0$ (assumed without loss of generality),
it follows from Lemma 2.4 that the sum converges to zero as $x \to \infty$. Thus
$f_x(s) \to f(s)$ as $x \to \infty$, which completes the proof. □
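
As a check on the truncation argument, the simplest case n = 2 can be worked out by hand. The computation below is a sketch under that assumption only; it uses h(v) = f(v) - v f'(v), which is what (2.16) gives for n = 2, and the truncated function h_x(v) = h(v ∧ x).

```latex
% Case n = 2: h(v) = f(v) - v f'(v), and f_x(s) = s \int_s^\infty v^{-2} h_x(v)\,dv.
\begin{align*}
f_x(s) &= s\int_s^x \frac{f(v) - v f'(v)}{v^2}\,dv \;+\; s\,h(x)\int_x^\infty \frac{dv}{v^2} \\
       &= s\Bigl[-\frac{f(v)}{v}\Bigr]_{v=s}^{v=x} \;+\; \frac{s}{x}\bigl(f(x) - x f'(x)\bigr) \\
       &= f(s) - s\,f'(x), \qquad x > s > 0 .
\end{align*}
% Since f' is nonpositive and nondecreasing with f'(\infty) = 0,
% f_x(s) \ge f(s) and f_x(s) decreases to f(s) as x \to \infty.
```

This agrees with the general formula for f_x with n = 2, where the sum reduces to the single term -f'(x){x - (x-s)}.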


Theorem 2.2 can be extended to n-monotone functions on $(0,\infty)$, to allow for
functions which are unbounded at zero as well as at infinity.

Corollary 2.2. If, in addition to (2.7), $P(Z_1=0,\ldots,Z_n=0) = 0$, then
($P(Z'_1=0,\ldots,Z'_n=0) = 0$ and) (2.8) holds for each n-monotone function f on
$(0,\infty)$ for which the expectations in (2.8) are defined.


Proof. Let f be n-monotone on $(0,\infty)$ with $f(0+) = \infty$ and $f(\infty) = -\infty$. (Functions
f with smaller ranges can be handled similarly or more easily.) Let $s_0$ be the
zero of f, $f(s_0) = 0$, and for each $k > 2/s_0$ define $f_k(s) = f(s+\frac{1}{k})$, $s \ge 0$.
Then each $f_k$ is n-monotone on $[0,\infty)$, and by Theorem 2.2, $Ef_k(\|Z\|^2) \ge Ef_k(\|Z'\|^2)$.
Also as $k \uparrow \infty$, $f_k \uparrow f$ on $(0,\infty)$. More precisely, $f_k^+ \uparrow f^+$ and by monotone
convergence $Ef_k^+(\|Z\|^2) \uparrow Ef^+(\|Z\|^2)$. Also for $s > s_0 - \frac{1}{k}$, since $-f^{(1)}$ is
nonnegative and nonincreasing, we have

$$0 \le f_k^-(s) - f^-(s) \le -f(s+\tfrac{1}{k}) + f(s) = -\int_s^{s+1/k} f^{(1)}(u)\, du \le \frac{1}{k}\, |f^{(1)}(s)| \le \frac{1}{k}\, |f^{(1)}(s_0/2)| ,$$

and thus $0 \le f_k^-(s) - f^-(s) \le \frac{1}{k} |f^{(1)}(s_0/2)|$, s > 0. It follows that $Ef_k^-(\|Z\|^2)$
and $Ef^-(\|Z\|^2)$ are finite or infinite together and thus $Ef_k^-(\|Z\|^2) \downarrow Ef^-(\|Z\|^2)$
(by dominated convergence if they are finite or trivially if they are infinite).
Since $Ef(\|Z\|^2)$ is defined by $Ef^+(\|Z\|^2) - Ef^-(\|Z\|^2)$ iff at least one of the
two terms is finite, (2.8) follows.

Some examples of $\infty$-monotone functions, to which Corollary 2.2 is
applicable, are: $s^a$ (a < 0), $-\log s$.
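
That these two functions have derivatives of alternating sign of every order can be seen directly. The short computation below is a sketch; it assumes, as in the discussion preceding Lemma 2.4, that n-monotonicity constrains only the derivatives of order k ≥ 1.

```latex
% f(s) = s^a with a < 0:
\[
(-1)^k f^{(k)}(s) = (-1)^k\, a(a-1)\cdots(a-k+1)\, s^{a-k}
                  = |a|\,(|a|+1)\cdots(|a|+k-1)\, s^{a-k} > 0 , \qquad s > 0,\ k \ge 1 .
\]
% f(s) = -\log s:
\[
(-1)^k f^{(k)}(s) = (k-1)!\; s^{-k} > 0 , \qquad s > 0,\ k \ge 1 .
\]
% In both cases (-1)^k f^{(k)} is also nonincreasing in s, for every k \ge 1.
```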

We  have  already  shown  that  the  2-monotone  functions  provide  the  appropriate 
class  for  the  result  of  Theorem  2.1.  The  following  example  shows  that 
3-monotone  functions  provide  the  appropriate  class  for  the  result  of 
Theorem  2.2  for  n  =  3  (by  constructing,  for  a  2-monotone  function  which  is  not 
3-monotone,  3-dimensional  random  vectors  Z  and  Z'  which  satisfy  (2.7)  and  for 
which  (2.8)  fails),  and  we  anticipate  that  similar  examples  would  show  the  same 
for  n  >  3. 

Example. Suppose f is 2-monotone but not 3-monotone on $[0,\infty)$. Then for some
a and b, $a \ge 0$, b > 0, one has

$$f(a+3b) - 3f(a+2b) + 3f(a+b) - f(a) > 0 . \tag{2.23}$$

(Implicitly we are saying that functions f which are 2-monotone and satisfy the
converse of (2.23) for all a and b are 3-monotone, which can be verified.) Let
$3\alpha^2 = a$ and $2\alpha^2 + \beta^2 = a + b$ be used to define $\alpha$ and $\beta$ ($0 \le \alpha < \beta$), and let Z
and Z' be three-dimensional random vectors whose distributions are described by
the following table:


P(Z'=z) 


From the last two columns it is apparent that condition (2.7) holds. Now for
$R^2 = ZZ^t$ and $R'^2 = Z'Z'^t$ we have


Ef(R2)  =  f (a)  +  11  f (a+b)  +  |i  f (a+2b)  +  — -  f(a+3b)  , 
Ef(R*2)  =  jj-  f (a)  +  11  f  (a+b)  +  15-  f(a+2b)  +  ||  f(a+3b)  . 


From (2.23) it follows that $Ef(R'^2) > Ef(R^2)$. Consequently, the assumption of
3-monotonicity in Theorem 2.2 when Z and Z' are three-dimensional is essential;
it is impossible to consider a larger class of functions. □


3.  Expectation  inequalities  for  pairs  of  functions 

In this section we consider random vectors Z and Z' (i.e. $R = \mathbb{R}^n$) which
satisfy (1.1) with $\mathcal{A}$ the class of all principal lower and upper ideals $(-\infty, z]$
and $[z,\infty)$, $z \in \mathbb{R}^n$. When n = 1, (1.1) says that Z and Z' have the same
distribution, which is not interesting. When n = 2, (1.1) says that Z and Z'
have the same marginal distributions and that their bivariate distribution
functions H and H' satisfy $H \ge H'$. Our attention will be focused on the
bivariate case, and only at the end of the section will we consider a higher
dimensional case.

It is shown in [2, 12] that (1.1) $\Leftrightarrow$ (1.2) with $F_1$ the class of all
quasi-monotone functions (cf. (1.14)) for which the expectations in (1.2) are
defined and which satisfy certain minor regularity conditions. (See
Theorem 1 in [2].) The separation approach yields (1.5.iii) with $F_2$ defined by
(1.4) as the class of all pairs of functions $\ell$, k which can be separated by a
quasi-monotone function m: $\ell = m + f$, $k = m - g$ where f and g are nonnegative. (Large
classes of quasi-monotone functions are known or can be constructed; see for
instance [2].) When $\ell$ and k are separated by a quasi-monotone function then
they satisfy (1.15) (cf. (1.4) and (1.14)). However (1.15) is not sufficient
for $\ell$ and k to be separated by a quasi-monotone function: there exist
functions $\ell$ and k satisfying (1.15) which are sufficiently close that no
quasi-monotone function can exist between them. This is easily demonstrated
with the aid of Figure I.

In Figure I, relevant values of k and $\ell$ are indicated at various points
within an array of eight points possessing a particular geometric orientation
in the plane. In order to obtain a contradiction, it is assumed that a
quasi-monotone function m satisfying $\ell \ge m \ge k$ does exist. Figure I indicates
two points in the array where it is impossible to define m simultaneously. In
order to ensure that (1.15) is satisfied, it is sufficient to define k as -10,
say, and $\ell$ as 10 at all points in the plane for which an explicit definition is
not given in Figure I.
    (k=0)  o-----------o  (ℓ=0)
           |           |
           |           |
    (m=β)  o--------(k=ℓ=1)--------o  (ℓ=0)
           |           |           |
           |           |           |
    (ℓ=0)  o-----------o-----------o  (k=0)
                     (m=α)

                   FIGURE I

By referring to the four points in the lower left-hand corner of Figure I, one
can easily deduce from $\ell \ge m \ge k$ and the quasi-monotonicity of m that the
inequality $\alpha + \beta \le 1$ must hold. In the same way, one can obtain the
contradicting inequalities $\alpha \ge 1$ and $\beta \ge 1$ by examining the four points in the
lower right- and upper left-hand corners of Figure I, respectively. Thus no
quasi-monotone function m exists which satisfies $\ell \ge m \ge k$. This example
suggests the possibility that (1.1) can hold without (1.3) holding for all
functions k and $\ell$ which satisfy (1.15) (and for which the expectations in (1.3)
are defined). In fact, an example based upon Figure I is easily constructed:
Let the distribution of Z assign mass 1/3 to each of the points in Figure I at
which $\ell = 0$, and let the distribution of Z' assign mass 1/3 to each of the
three points in Figure I at which k is explicitly defined. It is easily
checked that (1.1) holds but $E\ell(Z) < Ek(Z')$.
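
The counterexample can be verified mechanically. The sketch below rests on several assumptions that are not taken from the text: the eight points are placed on a 3x3 integer lattice (coordinates 0, 1, 2 are hypothetical), the central point is read as carrying k = ℓ = 1, condition (1.15) is taken in the form ℓ(x1,y1) + ℓ(x2,y2) ≥ k(x1,y2) + k(x2,y1) for x1 ≤ x2, y1 ≤ y2, and the ninth lattice point (top right, absent from the figure) simply receives the default values 10 and -10.

```python
from fractions import Fraction
from itertools import product

GRID = list(product(range(3), repeat=2))   # hypothetical coordinates (x, y); 0 = left / bottom
THIRD = Fraction(1, 3)

# Z puts mass 1/3 on the points where l = 0; Z' on the points where k is given explicitly.
Z_LAW = {(0, 0): THIRD, (2, 1): THIRD, (1, 2): THIRD}
ZP_LAW = {(0, 2): THIRD, (1, 1): THIRD, (2, 0): THIRD}

# Default values 10 and -10 as suggested in the text, then the explicit values.
ELL = {p: 10 for p in GRID}
K = {p: -10 for p in GRID}
for p in Z_LAW:
    ELL[p] = 0
K[(0, 2)] = 0
K[(2, 0)] = 0
K[(1, 1)] = 1      # assumed reading of the central label
ELL[(1, 1)] = 1    # assumed reading of the central label

def prob(law, pred):
    return sum(w for p, w in law.items() if pred(p))

def satisfies_1_1():
    # P(Z in A) >= P(Z' in A) for all principal lower and upper ideals A;
    # for lattice-valued vectors it is enough to test ideals generated by lattice points.
    for zx, zy in GRID:
        lower = lambda p: p[0] <= zx and p[1] <= zy
        upper = lambda p: p[0] >= zx and p[1] >= zy
        for ideal in (lower, upper):
            if prob(Z_LAW, ideal) < prob(ZP_LAW, ideal):
                return False
    return True

def satisfies_1_15_on_grid():
    # Assumed form of (1.15), tested on lattice rectangles (degenerate ones included).
    for x1, x2, y1, y2 in product(range(3), repeat=4):
        if x1 <= x2 and y1 <= y2:
            if ELL[(x1, y1)] + ELL[(x2, y2)] < K[(x1, y2)] + K[(x2, y1)]:
                return False
    return True

def expectation(law, g):
    return sum(w * g[p] for p, w in law.items())
```

With these choices `expectation(Z_LAW, ELL)` is 0 while `expectation(ZP_LAW, K)` is 1/3, so (1.3) fails although (1.1) and the assumed form of (1.15) both hold on the lattice.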

Thus if Z and Z' are bivariate random variables with equal marginal
distributions and with bivariate distribution functions satisfying $H \ge H'$ (i.e.
if (1.1) holds), then in general this does not imply that (1.3) holds for all
functions k and $\ell$ satisfying (1.15). We now show that this implication is true
in certain special cases, using the variation of the surrogate approach
involving condition (CO), which is described in the introduction. We begin
with the normal case.
Theorem 3.1. Suppose Z and Z' are bivariate normal random variables with
common means and variances and with correlation coefficients $\rho$ and $\rho'$
satisfying the inequality $\rho \ge \rho'$. Then (1.3) holds for every pair of functions
k and $\ell$ which satisfy (1.15) and for which the expectations appearing therein
make sense.


Proof. Let (S,T,U) be normally distributed with a zero mean vector and the
covariance matrix (rows and columns in the order S, T, U)

$$\begin{pmatrix} 1 & \rho & \rho'-\rho \\ \rho & 1 & \rho'-\rho \\ \rho'-\rho & \rho'-\rho & 2(\rho-\rho') \end{pmatrix} .$$

Define

$$X_1 = \mu_x + \sigma_x S , \quad Y_1 = \mu_y + \sigma_y T , \quad X_2 = \mu_x + \sigma_x (S+U) , \quad Y_2 = \mu_y + \sigma_y (T+U) ,$$

where $(\mu_x, \mu_y)$ is the common mean vector, and $\sigma_x^2$ and $\sigma_y^2$ are the common variances
of H and H'. It is easily checked that $(X_1,Y_1)$ and $(X_2,Y_2)$ have distribution
function H and that $(X_1,Y_2)$ and $(X_2,Y_1)$ have distribution function H'.
Moreover $(X_2-X_1)(Y_2-Y_1) = \sigma_x \sigma_y U^2 \ge 0$. Thus condition (CO) is satisfied and
(1.3) follows from (1.15), via (1.16) as discussed in the introduction. □
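
The "easily checked" claims about the four pairs amount to a few covariance computations, which the sketch below carries out for standardized variables (zero means, unit variances, an assumption made here only to shorten the code). The function names are illustrative. The determinant is included because the matrix must be nonnegative definite for (S,T,U) to exist; by direct expansion it equals 2(ρ-ρ')(1-ρ)(1+ρ').

```python
def stu_covariance(rho, rho_p):
    # Covariance matrix of (S, T, U) from the proof of Theorem 3.1.
    d = rho_p - rho
    return [[1.0, rho, d],
            [rho, 1.0, d],
            [d, d, 2.0 * (rho - rho_p)]]

def cov(C, a, b):
    # Covariance of the linear combinations a.(S,T,U) and b.(S,T,U).
    return sum(a[i] * C[i][j] * b[j] for i in range(3) for j in range(3))

def coupling_summary(rho, rho_p):
    C = stu_covariance(rho, rho_p)
    S, T = (1, 0, 0), (0, 1, 0)          # X1 and Y1 (standardized)
    SU, TU = (1, 0, 1), (0, 1, 1)        # X2 = S + U and Y2 = T + U (standardized)
    return {
        "var_X2": cov(C, SU, SU),
        "var_Y2": cov(C, TU, TU),
        "corr_X1Y1": cov(C, S, T),
        "corr_X2Y2": cov(C, SU, TU),
        "corr_X1Y2": cov(C, S, TU),
        "corr_X2Y1": cov(C, SU, T),
    }

def determinant(C):
    # 3x3 determinant by cofactor expansion along the first row.
    return (C[0][0] * (C[1][1] * C[2][2] - C[1][2] * C[2][1])
            - C[0][1] * (C[1][0] * C[2][2] - C[1][2] * C[2][0])
            + C[0][2] * (C[1][0] * C[2][1] - C[1][1] * C[2][0]))
```

For example, `coupling_summary(0.6, -0.2)` reports correlation 0.6 for the pairs (X1,Y1), (X2,Y2) and -0.2 for (X1,Y2), (X2,Y1), up to rounding.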


Thus when Z and Z' are as in Theorem 3.1, and $F_2$ is the class of all
pairs of functions $\ell$ and k satisfying (1.15) and for which the expectations in
(1.3) are defined, we have $\rho \ge \rho' \Rightarrow$ (1.3). On the other hand we clearly have
(1.3) $\Rightarrow$ (1.2) and, as was already mentioned, (1.2) $\Leftrightarrow$ (1.1) which in this
case is equivalent to $H \ge H'$. Thus $\rho \ge \rho' \Rightarrow H \ge H'$, which is a special case
of an n-dimensional result due to Slepian [9], and which implies $\rho \ge \rho' \Leftrightarrow H \ge H'$.
It then follows that when Z and Z' are bivariate normal variables with common
means and variances then

$$\rho \ge \rho' \ \Leftrightarrow \ H \ge H' \ \Leftrightarrow \ (1.2) \ \Leftrightarrow \ (1.3) .$$

Theorem 3.1 can be extended to higher dimensions and from normal to
elliptically contoured distributions. If Z is an n-dimensional random (row)
vector and, for some n (row) vector $\mu$ and some n×n nonnegative definite matrix
$\Sigma$, the characteristic function $\phi_{Z-\mu}(s)$ of $Z-\mu$ is a function of the quadratic
form $s\Sigma s^t$, $\phi_{Z-\mu}(s) = \phi(s\Sigma s^t)$, we say that Z has an elliptically contoured
distribution with parameters $\mu$, $\Sigma$ and $\phi$, and we write $Z \sim EC_n(\mu,\Sigma,\phi)$. When
$\phi(u) = \exp(-u/2)$, $EC_n(\mu,\Sigma,\phi)$ is the normal distribution $N_n(\mu,\Sigma)$. The location
and scale parameters $\mu$ and $\Sigma$ can be any n vector and any n×n nonnegative
definite matrix, while the class of admissible functions $\phi$ depends on the
rank k of $\Sigma$, $r(\Sigma) = k$, and consists of all functions of the form

$$\phi(u) = \int_{[0,\infty)} \Omega_k(r^2 u)\, dF(r) , \quad u \ge 0 ,$$

for some distribution function F on $[0,\infty)$, where $\Omega_k(\|s\|^2)$, $s \in \mathbb{R}^k$, is the
characteristic function of the uniform distribution on the surface of the unit
sphere of $\mathbb{R}^k$. This follows from a theorem of Schoenberg [8] and is discussed
in [1] where the following useful stochastic representation is also introduced.
Let $\Sigma = A^tA$ be a rank factorization of $\Sigma$, i.e. A is k×n and $r(\Sigma) = k = r(A)$.
Then Z has the "canonical" stochastic representation

$$Z \overset{d}{=} \mu + R\, U^{(k)} A ,$$

where the equality is in distribution, R is a nonnegative random variable (with
distribution F), $U^{(k)}$ is a k-dimensional random vector uniformly distributed on
the surface of the unit sphere in $\mathbb{R}^k$, and R and $U^{(k)}$ are independent.

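
The canonical representation translates directly into a sampling recipe. The sketch below is illustrative only: the function names are hypothetical, the uniform vector on the sphere is obtained by normalizing independent standard normals, and the caller supplies the law of R through a sampler. Taking R with R² chi-square distributed on k degrees of freedom is the choice that yields the normal case.

```python
import math
import random

def uniform_on_sphere(k, rng):
    # Normalizing k independent standard normals gives a point uniformly
    # distributed on the surface of the unit sphere in R^k.
    g = [rng.gauss(0.0, 1.0) for _ in range(k)]
    norm = math.sqrt(sum(x * x for x in g))
    return [x / norm for x in g]

def sample_ec(mu, A, sample_R, rng):
    # One draw of Z = mu + R * U^(k) * A, where A is a k x n matrix given as
    # a list of k rows and sample_R(rng) returns a nonnegative radius
    # drawn independently of U^(k).
    k, n = len(A), len(mu)
    U = uniform_on_sphere(k, rng)
    R = sample_R(rng)
    return [mu[j] + R * sum(U[i] * A[i][j] for i in range(k)) for j in range(n)]

def chi_radius(k):
    # Radius with R^2 chi-square on k degrees of freedom (the normal case).
    return lambda rng: math.sqrt(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k)))
```

A draw from the bivariate normal law with scale matrix A^t A would then be `sample_ec(mu, A, chi_radius(2), random.Random(0))` for a 2x2 matrix A.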
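Theorem 3.2. Suppose that $Z \sim EC_2(\mu,\Sigma,\phi)$ and $Z' \sim EC_2(\mu,\Sigma',\phi)$ where

$$\Sigma = \begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix} , \quad \Sigma' = \begin{pmatrix} \sigma_1^2 & \rho'\sigma_1\sigma_2 \\ \rho'\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix}$$

and $\rho \ge \rho'$. Then (1.3) holds for every pair of functions k and $\ell$ which satisfy
(1.15) and for which the expectations appearing in (1.3) are defined.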

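Proof. If $\rho = 1$ and $\rho' = -1$ we have

$$Z \overset{d}{=} \mu + RU^{(1)}A_1 , \quad Z' \overset{d}{=} \mu + RU^{(1)}A'_1 ,$$

where $A_1 = (\sigma_1, \sigma_2)$, $A'_1 = (\sigma_1, -\sigma_2)$. Since R and $U^{(1)}$ are independent, in order to
show $E\ell(Z) \ge Ek(Z')$ it suffices to show

$$E\ell(\mu + rU^{(1)}A_1) \ge Ek(\mu + rU^{(1)}A'_1) , \quad r \ge 0 . \tag{3.1}$$

Since $k(\cdot)$ and $\ell(\cdot)$ satisfy (1.15), so do $k(\mu + r\,\cdot)$ and $\ell(\mu + r\,\cdot)$ for every $\mu$ and
$r \ge 0$. Thus it suffices to show (3.1) when $\mu = 0$ and r = 1, i.e. it suffices
to show $E\ell(U^{(1)}A_1) \ge Ek(U^{(1)}A'_1)$, which is written as

$$\tfrac12 \ell(-\sigma_1,-\sigma_2) + \tfrac12 \ell(\sigma_1,\sigma_2) \ge \tfrac12 k(-\sigma_1,\sigma_2) + \tfrac12 k(\sigma_1,-\sigma_2)$$

and follows from (1.15).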

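Now assume that at least one of $\rho$, $\rho'$ differs from 1 in absolute value.
Putting

$$A = \begin{pmatrix} \sigma_1 \cos\alpha & \sigma_2 \sin\alpha \\ \sigma_1 \sin\alpha & \sigma_2 \cos\alpha \end{pmatrix} \quad \text{and} \quad A' = \begin{pmatrix} \sigma_1 \cos\alpha' & \sigma_2 \sin\alpha' \\ \sigma_1 \sin\alpha' & \sigma_2 \cos\alpha' \end{pmatrix} ,$$

where $\alpha$ and $\alpha'$, $-\frac{\pi}{4} \le \alpha' \le \alpha \le \frac{\pi}{4}$, are defined by $\rho = \sin 2\alpha$ and $\rho' = \sin 2\alpha'$, we
have $\Sigma = A^tA$ and $\Sigma' = A'^tA'$. When $-1 < \rho' \le \rho < 1$ then both $\Sigma$, $\Sigma'$ are of full
rank and r(A) = 2 = r(A') so that $\Sigma = A^tA$, $\Sigma' = A'^tA'$ are rank factorizations of
$\Sigma$, $\Sigma'$. It then follows that

$$Z \overset{d}{=} \mu + RU^{(2)}A , \quad Z' \overset{d}{=} \mu + RU^{(2)}A' . \tag{3.2}$$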


When one, but not both, of $\rho$, $\rho'$ equals 1 in absolute value, say $-1 < \rho' < \rho = 1$,
then $Z' \overset{d}{=} \mu + RU^{(2)}A'$ and

$$E e^{is(Z'-\mu)^t} = \phi(s\Sigma's^t) = \int_{[0,\infty)} \Omega_2(r^2 s\Sigma's^t)\, dF(r) ,$$

where F is the distribution of R. Since

$$E e^{is(Z-\mu)^t} = \phi(s\Sigma s^t) = \int_{[0,\infty)} \Omega_2(r^2 s\Sigma s^t)\, dF(r) ,$$

it is easily checked that $Z \overset{d}{=} \mu + RU^{(2)}A$. Hence (3.2) holds provided at least
one of $\rho$, $\rho'$ differs from 1 in absolute value. Because of the independence of
R and $U^{(2)}$, arguing as before, it suffices to show that $E\ell(U^{(2)}A) \ge Ek(U^{(2)}A')$.
This will be done by defining a random vector $(X_1,X_2,Y_1,Y_2)$ which satisfies
condition (CO): $(X_1,Y_1)$ and $(X_2,Y_2)$ will be distributed as $U^{(2)}A$, $(X_1,Y_2)$ and
$(X_2,Y_1)$ will be distributed as $U^{(2)}A'$, and the product $(X_2-X_1)(Y_2-Y_1)$ will be
nonnegative.

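The random vector $U^{(2)}$ can be taken to be $(\sin\theta, \cos\theta)$, where $\theta$ is
uniformly distributed on any interval of length $2\pi$. Then

$$U^{(2)}A = (\sigma_1 \sin(\theta+\alpha) , \ \sigma_2 \cos(\theta-\alpha)) , \quad U^{(2)}A' = (\sigma_1 \sin(\theta+\alpha') , \ \sigma_2 \cos(\theta-\alpha')) . \tag{3.3}$$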
Let

$$S = \sin(\theta+\alpha) , \quad T = \cos(\theta-\alpha) , \quad V = \sin(\alpha'-\theta) ,$$

and define

$$X_1 = \sigma_1 S , \quad X_2 = \sigma_1 (S + 2\cos(\alpha+\alpha')V) , \quad Y_1 = \sigma_2 T , \quad Y_2 = \sigma_2 (T + 2\sin(\alpha-\alpha')V) .$$

Since $2\sin(\alpha-\alpha')\cos(\alpha+\alpha') = \sin 2\alpha - \sin 2\alpha' = \rho - \rho' \ge 0$, it follows that
$(X_2-X_1)(Y_2-Y_1) \ge 0$. By further trigonometric manipulations, one obtains

$$S + 2\cos(\alpha+\alpha')V = \sin((2\alpha'-\theta)+\alpha) , \quad T + 2\sin(\alpha-\alpha')V = \cos((2\alpha'-\theta)-\alpha) .$$

Since $2\alpha'-\theta$ is uniformly distributed on an interval of length $2\pi$, $(X_2,Y_2)$, as
well as $(X_1,Y_1)$, is distributed as $U^{(2)}A$. Similarly, one finds that $(X_1,Y_2)$
and $(X_2,Y_1)$ are distributed as $U^{(2)}A'$. This completes the proof.
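
The trigonometric identities behind this coupling are easy to spot-check numerically. The sketch below (function names are illustrative, σ1 = σ2 = 1 is assumed for brevity) draws random angles, forms the four coordinates, and compares them with the angle representations used in the proof; the mixed pairs are matched to the second ellipse at the shifted angles θ+α-α' and α+α'-θ, which are uniform on an interval of length 2π whenever θ is.

```python
import math
import random

def coupling(theta, alpha, alpha_p):
    # The four coordinates from the proof of Theorem 3.2 with sigma_1 = sigma_2 = 1.
    S = math.sin(theta + alpha)
    T = math.cos(theta - alpha)
    V = math.sin(alpha_p - theta)
    X1, Y1 = S, T
    X2 = S + 2.0 * math.cos(alpha + alpha_p) * V
    Y2 = T + 2.0 * math.sin(alpha - alpha_p) * V
    return X1, X2, Y1, Y2

def check_coupling(trials=1000, seed=0):
    # Returns (largest identity error, smallest product (X2-X1)(Y2-Y1)).
    rng = random.Random(seed)
    worst, min_product = 0.0, float("inf")
    for _ in range(trials):
        alpha_p, alpha = sorted(rng.uniform(-math.pi / 4, math.pi / 4) for _ in range(2))
        theta = rng.uniform(0.0, 2.0 * math.pi)
        X1, X2, Y1, Y2 = coupling(theta, alpha, alpha_p)
        psi = 2.0 * alpha_p - theta          # angle placing (X2, Y2) on the first ellipse
        t1 = theta + alpha - alpha_p         # angle placing (X1, Y2) on the second ellipse
        t2 = alpha + alpha_p - theta         # angle placing (X2, Y1) on the second ellipse
        errors = [
            X2 - math.sin(psi + alpha), Y2 - math.cos(psi - alpha),
            X1 - math.sin(t1 + alpha_p), Y2 - math.cos(t1 - alpha_p),
            X2 - math.sin(t2 + alpha_p), Y1 - math.cos(t2 - alpha_p),
        ]
        worst = max(worst, max(abs(e) for e in errors))
        min_product = min(min_product, (X2 - X1) * (Y2 - Y1))
    return worst, min_product
```

A call such as `check_coupling()` should return an error at the level of floating-point rounding and a smallest product that is nonnegative up to the same rounding.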


Remark. Implicit in this proof is the use of a fact about any two ellipses
which are inscribed in the same rectangle. Each point on one of the ellipses
is the vertex of a rectangle whose opposite vertex is on the same ellipse and
whose adjacent vertices are on the other ellipse. (In fact there are two such
rectangles.) Whether this is a known fact from projective geometry is unknown
to the authors. (In the present context, the two ellipses are the ranges of
the random vectors $U^{(2)}A$ and $U^{(2)}A'$.)


When k and $\ell$ are functions of x + y.

Suppose k and $\ell$, which are defined for $(x,y) \in \mathbb{R}^2$, are functions of the
sum x+y. For convenience, we shall write them as k(x+y) and $\ell(x+y)$. In terms
of

$$u = \tfrac12 (x_1+x_2+y_1+y_2) , \quad v = \tfrac12 (x_2-x_1+y_2-y_1) , \quad w = \tfrac12 |x_2-x_1+y_1-y_2| ,$$

the inequalities (1.15) become

$$\ell(u+v) + \ell(u-v) \ge k(u+w) + k(u-w) , \quad u \in \mathbb{R} , \ 0 \le w \le v < \infty . \tag{3.4}$$

When k = $\ell$, this condition is equivalent to convexity. Thus k(x+y) is
quasi-monotone (as a function of (x,y)) if and only if it is a convex function
(of the single variable t = x + y).
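
The reduction to (3.4) can be written out in two lines. The sketch below assumes that (1.15), which is stated in the introduction, has the form ℓ(x1,y1) + ℓ(x2,y2) ≥ k(x1,y2) + k(x2,y1) for x1 ≤ x2 and y1 ≤ y2.

```latex
% With u, v, w as defined above and x_1 \le x_2, y_1 \le y_2:
\begin{align*}
x_1 + y_1 &= u - v, & x_2 + y_2 &= u + v, \\
\{\,x_1 + y_2,\; x_2 + y_1\,\} &= \{\,u - w,\; u + w\,\}, &
0 \le w &= \tfrac12\bigl|(x_2 - x_1) - (y_2 - y_1)\bigr| \le \tfrac12\bigl((x_2 - x_1) + (y_2 - y_1)\bigr) = v .
\end{align*}
% Hence \ell(x_1+y_1) + \ell(x_2+y_2) \ge k(x_1+y_2) + k(x_2+y_1) reads
% \ell(u-v) + \ell(u+v) \ge k(u-w) + k(u+w), and every triple (u, v, w) with
% 0 \le w \le v arises from some choice of x_1 \le x_2, y_1 \le y_2.
```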

In general, k can be no larger than k* defined by

$$k^*(u) = \tfrac12 \inf_{v \ge 0} \{\ell(u+v) + \ell(u-v)\} . \tag{3.5}$$

(Set w = 0 in (3.4).) Assuming that k* is finite and measurable, k can equal
k* if and only if

$$\inf_{\alpha \ge 0} \{\ell(u+w+\alpha) + \ell(u+w-\alpha)\} + \inf_{\alpha \ge 0} \{\ell(u-w+\alpha) + \ell(u-w-\alpha)\} \le 2\{\ell(u+v) + \ell(u-v)\}$$

for all $u \in \mathbb{R}$, $0 \le w \le v < \infty$. This condition is met if for each $u \in \mathbb{R}$, $w \ge 0$
and $v \ge w$, there exists an $\alpha \ge 0$ such that

$$f_u(w+\alpha) + f_u(w-\alpha) \le 2 f_u(v) , \tag{3.6}$$

where $f_u$ is the even function defined by $f_u(v) = \ell(u+v) + \ell(u-v)$. A suitable
value for $\alpha$ can be obtained if $f_u$ is nondecreasing (set $\alpha = 0$), nonincreasing
(set $\alpha = v + w$), or concave (set $\alpha = v$) on $[0,\infty)$. Summarizing, if k* is finite
and measurable, and if for each u, $\ell(u+v) + \ell(u-v)$ is a nondecreasing,
nonincreasing or concave function of v on $[0,\infty)$, then k* is a suitable and
maximal choice for k. Under these conditions on k* and $\ell$, any measurable
function $k \le k^*$ satisfies (3.4).
The following are examples of functions $\ell$ which meet these conditions:

(i) If $\ell$ is convex (as a function of the single variable t = x + y), then
$\ell(u+v) + \ell(u-v)$ is a nondecreasing function of v on $[0,\infty)$ and k* = $\ell$.

(ii) If $\ell$ is a symmetric function with respect to some point $(u_0, \ell_0)$ in $\mathbb{R}^2$
(i.e., if $\ell(u_0+v) + \ell(u_0-v) = 2\ell_0$, $v \ge 0$), and $\ell$ is concave on $[u_0,\infty)$,
then $\ell(u+v) + \ell(u-v)$ is a nonincreasing function of $v \ge 0$ for $u \ge u_0$ and is
a nondecreasing function of $v \ge 0$ for $u \le u_0$. It follows that k* = $\ell$
on $(-\infty, u_0]$.

The values of k* on $(u_0,\infty)$ must be evaluated from (3.5) in each specific
case, and it must be checked whether k* is finite and measurable. For
instance, if $\ell$ is the distribution function of a normal random variable with
mean $\mu$ and variance $\sigma^2$, then k* = $\ell$ on $(-\infty,\mu]$ and k* = $\frac12$ on $[\mu,\infty)$; hence k*
is a maximal choice for k. (This is another example of functions k and $\ell$
satisfying (1.15) which cannot be separated by a quasi-monotone function m,
i.e., there is no function m which satisfies (1.14) and $k \le m \le \ell$. The
details are left to the reader.)
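
The normal example can be reproduced numerically. The sketch below is an illustration under stated assumptions: ℓ is the standard normal distribution function (μ = 0, σ = 1), the infimum in (3.5) is approximated by a minimum over a finite grid of v values, and (3.4) is tested on a coarse grid of (u, v, w). The function names and grid sizes are choices made here.

```python
import math

def Phi(t):
    # Standard normal distribution function, playing the role of l.
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def k_star(u, ell=Phi, v_max=40.0, steps=400):
    # (3.5): half the infimum over v >= 0 of l(u+v) + l(u-v),
    # approximated by a minimum over an equally spaced grid on [0, v_max].
    return 0.5 * min(ell(u + v_max * i / steps) + ell(u - v_max * i / steps)
                     for i in range(steps + 1))

def worst_violation_of_3_4(ell=Phi, half_width=3.0, step=0.5):
    # Largest value of k*(u+w) + k*(u-w) - l(u+v) - l(u-v) over a grid with
    # 0 <= w <= v; a value <= 0 (up to rounding) means (3.4) holds there with k = k*.
    m = int(round(half_width / step))
    worst = -float("inf")
    for iu in range(-m, m + 1):
        u = iu * step
        for iv in range(m + 1):
            v = iv * step
            for iw in range(iv + 1):
                w = iw * step
                gap = k_star(u + w, ell) + k_star(u - w, ell) - ell(u + v) - ell(u - v)
                worst = max(worst, gap)
    return worst
```

On this grid `k_star(u)` should agree with min(Phi(u), 1/2), in line with the description of k* for the normal distribution function.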

When the supports of X+Y and X'+Y' are not the entire real line, the
ranges of u, v and w in (3.6) can be reduced. Thus k and $\ell$ will have to
satisfy fewer restrictions, and a wider variety of suitable k, $\ell$ pairs will be
permitted. (The same can be said when k and $\ell$ are general functions on $\mathbb{R}^2$ and
the supports of (X,Y) and (X',Y') are proper subsets of $\mathbb{R}^2$.)

When k and $\ell$ are functions of x-y (instead of x+y) the analogue of (3.4)
is

$$\ell(u+w) + \ell(u-w) \ge k(u+v) + k(u-v) , \quad u \in \mathbb{R} , \ 0 \le w \le v < \infty , \tag{3.7}$$

(the definitions of u, v and w must be modified appropriately) and the analysis
of (3.7) is similar to that given for (3.4).
A generalization to higher dimensions.

A higher dimensional version of Theorem 3.2 can be proven by reducing
the dimension to 2 through conditioning.

Theorem 3.3. Suppose that $Z \sim EC_n(\mu,\Sigma,\phi)$ and $Z' \sim EC_n(\mu,\Sigma',\phi)$ where
$\Sigma = (\sigma_{ij})$, $\Sigma' = (\sigma'_{ij})$, $\sigma_{ii} = \sigma'_{ii}$, $\sigma_{ij} \ge \sigma'_{ij}$, $i \ne j$. Then (1.3) holds for any pair of
functions k and $\ell$ of n variables for which the expectations appearing in (1.3)
make sense and which satisfy (1.15) as functions of any two of the n variables
for all fixed values of the remaining n-2 variables.

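Proof. According to the argument in the first paragraph of the proof of
Theorem 5.1 in [3], it suffices to prove the result in the case where $\sigma_{12} > \sigma'_{12}$,
and $\sigma_{ij} = \sigma'_{ij}$ for all $(i,j) \ne (1,2),(2,1)$. Write

$$Z = (Z_1, Z_2) , \quad \mu = (\mu_1, \mu_2) , \quad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} ,$$

where $Z_1$, $\mu_1$ are two-dimensional and $\Sigma_{11}$ is 2×2. It is easily seen via the
characteristic function that

$$(Y_1, Y_2) = (Z_1-\mu_1 , \ Z_2-\mu_2) \begin{pmatrix} I & 0 \\ -\Sigma_{22}^{+}\Sigma_{21} & I \end{pmatrix} \sim EC_n\left( 0 , \begin{pmatrix} \Sigma^* & 0 \\ 0 & \Sigma_{22} \end{pmatrix} , \phi \right) ,$$

where $\Sigma_{22}^{+}$ is the self-adjoint generalized inverse of $\Sigma_{22}$ and $\Sigma^* = \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{+}\Sigma_{21}$.
Now let $RU^{(n)} = R(U_1,U_2) \sim EC_n(0,I,\phi)$, where $U_1$ is two-dimensional. Then

$$(Y_1, Y_2) \overset{d}{=} R(U_1 \Sigma^{*1/2} , \ U_2 \Sigma_{22}^{1/2})$$

and

$$(Z_1, Z_2) \overset{d}{=} (\mu_1 + RU_2\Sigma_{22}^{1/2}\Sigma_{22}^{+}\Sigma_{21} + RU_1\Sigma^{*1/2} , \ \mu_2 + RU_2\Sigma_{22}^{1/2}) .$$

Since $U^{(n)}$ is uniformly distributed on the surface of the n-dimensional unit
sphere, $(U_1 \mid U_2 = u_2) \overset{d}{=} [1 - u_2u_2^t]_+^{1/2}\, U^{(2)}$ where $[a]_+ = \max(a,0)$. (See for instance
Lemma 3 in [1].) Since R and $U^{(n)}$ are independent, it follows that for all $r \ge 0$
and $u_2$,

$$((Z_1,Z_2) \mid R = r , \ U_2 = u_2) \overset{d}{=} (\mu^* + r^*U^{(2)}\Sigma^{*1/2} , \ \mu_2 + ru_2\Sigma_{22}^{1/2}) , \tag{3.8}$$

where

$$\mu^* = \mu_1 + ru_2\Sigma_{22}^{1/2}\Sigma_{22}^{+}\Sigma_{21} , \quad r^* = r[1 - u_2u_2^t]_+^{1/2} .$$

Since $\Sigma' = \begin{pmatrix} \Sigma'_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}$, $Z' = (Z'_1, Z'_2)$ satisfies (3.8) with $\mu^{*\prime} = \mu^*$, $r^{*\prime} = r^*$ and
$\Sigma^{*\prime} = \Sigma'_{11} - \Sigma_{12}\Sigma_{22}^{+}\Sigma_{21}$. In order to verify $E\ell(Z) \ge Ek(Z')$ it thus suffices to
show that for all $r \ge 0$ and $u_2$,

$$E\ell(\mu^* + r^*U^{(2)}\Sigma^{*1/2} , \ \mu_2 + ru_2\Sigma_{22}^{1/2}) \ge Ek(\mu^* + r^*U^{(2)}\Sigma^{*\prime 1/2} , \ \mu_2 + ru_2\Sigma_{22}^{1/2}) ,$$

and this follows from Theorem 3.2.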


Remarks. 1. By letting k and $\ell$ in Theorem 3.3 be indicator functions of
$(-\infty, z]$, $z \in \mathbb{R}^n$, one obtains the inequality $H \ge H'$ where H and H' are the
distribution functions of Z and Z' respectively. For normal distributions this
inequality is a well-known result due to Slepian [9], and for elliptically
contoured distributions it has been obtained by Das Gupta et al. [3] under the
assumptions that the matrices $\Sigma$, $\Sigma'$ are invertible and that densities exist.

2. Under the assumptions of Theorem 3.3 it is also true that
$P(Z \in A) \ge P(Z' \in A)$ for every set of the form $\{u \in \mathbb{R}^n : u \ge z\}$, $z \in \mathbb{R}^n$. Thus all
results in this section are concerned with random vectors that are
stochastically ordered in the sense described in the beginning of this section.
(This observation may point the way to extensions that are not confined to
Euclidean-space-valued random variables.)

3. The approach used in proving Theorem 3.3 can be used to extend
the theorem to random vectors Z and Z' with more general distributions than
elliptically contoured. For instance the theorem holds for random variables
with distributions $Z \overset{d}{=} \mu + U^{(n)}AR$ and $Z' \overset{d}{=} \mu + U^{(n)}A'R$, where R is any random
matrix with nonnegative components which is independent of $U^{(n)}$.
4 .  Necessary  and  sufficient  conditions  for  Cl .3) 

It is possible to characterize the bivariate distribution functions $H$ and $H'$ which satisfy condition (C0). This is accomplished by a straightforward generalization of the proof given by Sudakov [11] of a theorem by Strassen [10]. Cast in our context this slight extension of Strassen's theorem says that (C0) is equivalent to the following condition (C1):

(C1) $P(Z \in A) \ge P(Z' \in A')$, $(A,A') \in \mathcal{A}_2 = \{(A,A') : 1_A, 1_{A'} \text{ satisfy (1.15)}\}$.

Thus if (C3) denotes the following,

(C3) $E\ell(Z) \ge Ek(Z')$, $(\ell,k) \in \mathcal{F}_2$, the class of all functions $\ell, k$ which satisfy (1.15) and for which the expectations are defined,

we have that conditions (C0), (C1) and (C3) are equivalent. The implications (C0) $\Rightarrow$ (C3) and (C3) $\Rightarrow$ (C1) are immediate, and the nontrivial ones are (C1) $\Rightarrow$ (C3) $\Rightarrow$ (C0).
Two indicator functions $k = 1_A$ and $\ell = 1_B$ satisfy (1.15) if and only if $A \subset B$ and for every rectangular set of four points in $R^2$ (vertices labelled $a$, $b$, $c$, $d$, with $d$ and $c$ on one side of the rectangle) the number of points of $\{a,c\}$ in $B$ is larger than or equal to the number of points of $\{b,d\}$ in $A$. By choosing various such pairs of sets $A$ and $B$ one finds that the inequality in (C3) requires $H$ and $H'$ to have common marginal distributions and to satisfy $H \ge H'$. In the case of normal or elliptically contoured distributions $H$ and $H'$, these two properties imply (C3) as shown in Theorems 3.1 and 3.2. The case of more general distributions requires further investigation.

Because of the equivalence of conditions (C0), (C1) and (C3), our goal, i.e. condition (C3), can be achieved by establishing either condition (C0) or (C1). In Theorems 3.1 and 3.2 condition (C0) is established. Even though condition (C1) is very natural and satisfactory, especially in its relationship with (C3), it is unfortunately very difficult to verify, and we have failed to achieve this for any distributions $H$ and $H'$.

The equivalence of (C0), (C1) and (C3) is a special case of a more
general  situation  which  provides  new  ways  of  obtaining  inequalities  of  the  type 
(1.3),  and  of  showing  the  existence  of  joint  distributions  with  fixed  support 
and  with  certain  marginals  fixed.  To  illustrate  the  power  and  novelty  of  this 
approach  let  us  consider  a  few  examples. 

Let $F$ be a closed subset of $R^4$, and consider an inequality between functions $k$ and $\ell$ of the following type (simpler than (1.15))

(CF) $\ell(x_1,y_1) \ge k(x_2,y_2)$, $(x_1,x_2,y_1,y_2) \in F$,

and the following conditions which depend on $F$.

(C0F) There exists a random vector $(X_1,X_2,Y_1,Y_2)$ whose values are in the set $F$ and which is such that the bivariate marginal distributions of $(X_1,Y_1)$ and $(X_2,Y_2)$ are $H$ and $H'$ respectively.

(C1F) $P(Z \in A) \ge P(Z' \in A')$, $(A,A') \in \mathcal{A}_2 = \{(A,A') : 1_A, 1_{A'} \text{ satisfy (CF)}\}$.

(C3F) $E\ell(Z) \ge Ek(Z')$, $(\ell,k) \in \mathcal{F}_2$, the class of all functions $\ell, k$ which satisfy (CF) and for which the expectations are defined.
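To make conditions (C0F) and (C3F) concrete, the following is a minimal sketch on a toy discrete example. The coupling `joint`, the choice $F = \{x_1 \ge x_2,\ y_1 \ge y_2\}$, and the functions `ell` and `k` are all hypothetical illustrations, not taken from the paper. The sketch reads off the two bivariate marginals $H$ and $H'$ of a vector supported on $F$ and confirms the expectation inequality for one pair satisfying (CF).

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical joint pmf of (X1, X2, Y1, Y2); points and weights are illustrative.
joint = {
    (1, 0, 1, 0): Fraction(1, 4),
    (1, 1, 0, 0): Fraction(1, 4),
    (2, 1, 1, 1): Fraction(1, 2),
}

def in_F(x1, x2, y1, y2):
    """Membership in the assumed closed set F = {x1 >= x2, y1 >= y2}."""
    return x1 >= x2 and y1 >= y2

def marginal(pmf, idx):
    """Marginal pmf of the coordinates listed in idx."""
    out = defaultdict(Fraction)
    for point, weight in pmf.items():
        out[tuple(point[i] for i in idx)] += weight
    return dict(out)

def expectation(pmf, f):
    """Expectation of f under a bivariate pmf."""
    return sum(weight * f(*z) for z, weight in pmf.items())

H = marginal(joint, (0, 2))        # law of Z  = (X1, Y1)
H_prime = marginal(joint, (1, 3))  # law of Z' = (X2, Y2)

# ell is coordinatewise increasing and k <= ell pointwise, so
# ell(x1, y1) >= ell(x2, y2) >= k(x2, y2) on F, i.e. (CF) holds.
def ell(x, y):
    return x + 2 * y

def k(x, y):
    return x + 2 * y - abs(x - y)

E_ell = expectation(H, ell)
E_k = expectation(H_prime, k)
print(E_ell, E_k)
```

Because the vector takes its values in $F$, the inequality $\ell(X_1,Y_1) \ge k(X_2,Y_2)$ holds at every support point, and the comparison of `E_ell` and `E_k` is just that pointwise inequality averaged.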

By Strassen's theorem, (C0F) and (C1F) are equivalent. Also, if (C0F) holds we have

$$\ell(X_1,Y_1) \ge k(X_2,Y_2) \quad \text{a.s.}$$

for all pairs $k$ and $\ell$ satisfying (CF), and by taking expectations (C3F) follows. Thus (C0F) $\Rightarrow$ (C3F) and clearly (C3F) $\Rightarrow$ (C1F). Hence conditions (C0F), (C1F) and (C3F) are equivalent. By choosing

$$F = \{(x_1,x_2,y_1,y_2) : x_1 \ge x_2,\ y_1 \ge y_2\},$$

(CF) becomes $\ell(x_1,y_1) \ge k(x_2,y_2)$, $x_1 \ge x_2$, $y_1 \ge y_2$; (C1F) is equivalent to

$$H(I) \ge H'(I) \quad \text{for all increasing sets } I;$$

and the result includes Theorem 1(i), (iv), and (vi) of Kamae, Krengel and O'Brien [4]. Of course, any one or both of the inequalities in the definition of $F$ could be reversed with corresponding results. If we choose
$$F = \{(x_1,x_2,y_1,y_2) : \max(|x_1|,|y_1|) \le \max(|x_2|,|y_2|)\},$$

then (C1F) becomes equivalent to

$$H(A) \ge H'(A) \quad \text{for all squares } A = \{(x,y) : |x| \le a,\ |y| \le a\}, \tag{4.1}$$

and thus (4.1), (C0F) and (C3F) are equivalent. When $H$ and $H'$ are normal with zero means, common variances and correlation coefficients $\rho$ and $\rho'$ satisfying $|\rho| \ge |\rho'|$, then (4.1) is satisfied and thus (C0F) and (C3F) hold, both new results. The same is true for absolutely continuous elliptically contoured distributions $EC_2(0,\Sigma,\phi)$ and $EC_2(0,\Sigma',\phi)$, where $\Sigma, \Sigma'$ are as in Theorem 3.2 with $|\rho| \ge |\rho'|$; in this case inequality (4.1) follows from Theorem 2.1 of [3].
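As a rough numerical illustration of (4.1) in the normal case, the sketch below estimates $P(|X| \le a, |Y| \le a)$ by Monte Carlo for a standard bivariate normal pair with correlation $\rho$. The function name `square_probability`, the sample size, the seed, and the values $\rho = 0.9$, $\rho' = 0$, $a = 1$ are assumptions chosen for illustration; a simulation of this kind is only a sanity check, not a proof.

```python
import math
import random

def square_probability(rho, a, n_samples=50_000, seed=0):
    """Monte Carlo estimate of P(|X| <= a, |Y| <= a) for a zero-mean,
    unit-variance bivariate normal pair (X, Y) with correlation rho."""
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho * rho)
    hits = 0
    for _ in range(n_samples):
        x = rng.gauss(0.0, 1.0)
        w = rng.gauss(0.0, 1.0)
        y = rho * x + scale * w  # Y has unit variance and corr(X, Y) = rho
        if abs(x) <= a and abs(y) <= a:
            hits += 1
    return hits / n_samples

p_strong = square_probability(rho=0.9, a=1.0)  # plays the role of H(A)
p_weak = square_probability(rho=0.0, a=1.0)    # plays the role of H'(A)
print(p_strong, p_weak)
```

For $\rho' = 0$ the exact value is the square of $P(|X| \le 1)$, roughly 0.466, which gives a reference point for the estimate `p_weak`.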

For not necessarily absolutely continuous elliptically contoured distributions $EC_2(0,\Sigma,\phi)$ and $EC_2(0,\Sigma',\phi)$, where $\Sigma, \Sigma'$ are as in Theorem 3.1 with $|\rho| \ge |\rho'|$, we now give a simple proof of (C3F), and thus also of (4.1), in the case where the common variances of $\Sigma, \Sigma'$ are equal. The approach is through a construction similar to that of Theorem 3.2 and thus the result is obtained without using Strassen's theorem. Also the result is slightly stronger than that in the previous paragraph in that the elliptically contoured distributions are not required to be absolutely continuous. Even though we are assuming for simplicity of the construction that the common variances of $\Sigma, \Sigma'$ are equal, no doubt a similar, but somewhat more involved, construction would be feasible when the variances are not equal.

Theorem 4.1. Suppose that $Z \sim EC_2(0,\Sigma,\phi)$ and $Z' \sim EC_2(0,\Sigma',\phi)$, where $\Sigma$ and $\Sigma'$ have equal common variances $\sigma^2$ and correlation coefficients $\rho$ and $\rho'$ with $|\rho| \ge |\rho'|$. Then (C3F) holds, as well as (4.1).
Proof. As in the proof of Theorem 3.2 it suffices to show $E\ell(U^{(2)}A) \ge Ek(U^{(2)}A')$ (here $\mu = 0$). This will be done by defining a random vector $(X_1,X_2,Y_1,Y_2)$ which satisfies condition (C0F); i.e. $(X_1,Y_1) \stackrel{d}{=} U^{(2)}A$, $(X_2,Y_2) \stackrel{d}{=} U'^{(2)}A'$, and

$$\max(|X_1|,|Y_1|) \le \max(|X_2|,|Y_2|).$$

The random vectors $U^{(2)}$ and $U'^{(2)}$ can be taken to be $(\cos\Theta, \sin\Theta)$ and $(\cos\Theta', \sin\Theta')$, where $\Theta$ and $\Theta'$ are uniformly distributed on intervals of length $2\pi$. Then we can take

$$(X_1,Y_1) = \sigma(\cos(\Theta-\alpha),\ \sin(\Theta+\alpha)), \qquad (X_2,Y_2) = \sigma(\cos(\Theta'-\alpha'),\ \sin(\Theta'+\alpha')),$$

where $\sin 2\alpha = \rho$, $\sin 2\alpha' = \rho'$, $-\frac{\pi}{4} \le \alpha, \alpha' \le \frac{\pi}{4}$. We will now determine the joint distribution of $(\Theta,\Theta')$ so that the marginals will be uniform on intervals of length $2\pi$ and

$$\max\{|\cos(\Theta-\alpha)|,\ |\sin(\Theta+\alpha)|\} \le \max\{|\cos(\Theta'-\alpha')|,\ |\sin(\Theta'+\alpha')|\}. \tag{4.2}$$

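We have

$$|\cos(\Theta'-\alpha')| \ge |\sin(\Theta'+\alpha')| \iff \cos 2(\Theta'-\alpha') \ge -\cos 2(\Theta'+\alpha') \iff \cos 2\Theta' \cos 2\alpha' \ge 0 \iff \cos 2\Theta' \ge 0$$

and similarly

$$|\cos(\Theta-\alpha)| \le |\cos(\Theta'-\alpha')| \iff \sin(\Theta+\Theta'-\gamma)\sin(\Theta-\Theta'-\beta) \ge 0$$

$$|\sin(\Theta+\alpha)| \le |\cos(\Theta'-\alpha')| \iff \cos(\Theta+\Theta'+\beta)\cos(\Theta-\Theta'+\gamma) \ge 0$$

$$|\cos(\Theta-\alpha)| \le |\sin(\Theta'+\alpha')| \iff \cos(\Theta+\Theta'-\beta)\cos(\Theta-\Theta'-\gamma) \le 0$$

$$|\sin(\Theta+\alpha)| \le |\sin(\Theta'+\alpha')| \iff \sin(\Theta+\Theta'+\gamma)\sin(\Theta-\Theta'+\beta) \le 0$$

where $\beta = \alpha - \alpha'$, $\gamma = \alpha + \alpha'$ (so that $|\beta| \le \frac{\pi}{2}$ and $|\gamma| \le \frac{\pi}{2}$). Thus (4.2) is equivalent to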
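$$\left\{\begin{array}{l} \cos 2\Theta' \ge 0 \\ \sin(\Theta+\Theta'-\gamma)\sin(\Theta-\Theta'-\beta) \ge 0 \\ \cos(\Theta+\Theta'+\beta)\cos(\Theta-\Theta'+\gamma) \ge 0 \end{array}\right\} \quad \text{or} \quad \left\{\begin{array}{l} \cos 2\Theta' \le 0 \\ \cos(\Theta+\Theta'-\beta)\cos(\Theta-\Theta'-\gamma) \le 0 \\ \sin(\Theta+\Theta'+\gamma)\sin(\Theta-\Theta'+\beta) \le 0 \end{array}\right\}. \tag{4.3}$$

The two sets of inequalities in (4.3) determine the set where the support of the joint distribution of $(\Theta,\Theta')$ must lie.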
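The equivalence of (4.2) and (4.3) can be spot-checked numerically. The sketch below is an illustration under assumptions: the function names, the random sampling scheme, and the tolerance `tol` are not from the paper, and $\alpha, \alpha'$ are sampled strictly inside $(-\pi/4, \pi/4)$ so that $\cos 2\alpha' > 0$. It compares the two conditions at random points and counts disagreements away from the boundary where (4.2) holds with equality.

```python
import math
import random

def margin_42(theta, theta_p, alpha, alpha_p):
    """Right side minus left side of (4.2); (4.2) holds iff this is >= 0."""
    left = max(abs(math.cos(theta - alpha)), abs(math.sin(theta + alpha)))
    right = max(abs(math.cos(theta_p - alpha_p)), abs(math.sin(theta_p + alpha_p)))
    return right - left

def holds_43(theta, theta_p, alpha, alpha_p):
    """Evaluate the two alternative systems of inequalities in (4.3)."""
    beta, gamma = alpha - alpha_p, alpha + alpha_p
    s, d = theta + theta_p, theta - theta_p
    first = (math.cos(2 * theta_p) >= 0
             and math.sin(s - gamma) * math.sin(d - beta) >= 0
             and math.cos(s + beta) * math.cos(d + gamma) >= 0)
    second = (math.cos(2 * theta_p) <= 0
              and math.cos(s - beta) * math.cos(d - gamma) <= 0
              and math.sin(s + gamma) * math.sin(d + beta) <= 0)
    return first or second

def count_disagreements(n_points=20_000, seed=1, tol=1e-6):
    """Count sampled points where (4.2) and (4.3) disagree, skipping
    near-ties in (4.2) where rounding could flip the comparison."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(n_points):
        alpha = rng.uniform(-0.78, 0.78)    # strictly inside (-pi/4, pi/4)
        alpha_p = rng.uniform(-0.78, 0.78)
        theta = rng.uniform(0.0, 2 * math.pi)
        theta_p = rng.uniform(0.0, 2 * math.pi)
        m = margin_42(theta, theta_p, alpha, alpha_p)
        if abs(m) < tol:
            continue
        if (m > 0) != holds_43(theta, theta_p, alpha, alpha_p):
            bad += 1
    return bad

print(count_disagreements())
```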

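Let us first consider the case where $0 \le \rho' \le \rho$, i.e. $0 \le \alpha' \le \alpha \le \frac{\pi}{4}$. When $\alpha = \alpha'$, i.e. $\beta = 0$, we can take $\Theta = \Theta'$. In the general case, $\beta > 0$, we can take $\Theta$ to be the following function of $\Theta'$:

$$\Theta = \begin{cases} \Theta' + \beta & \text{for } -\frac{\pi}{4} \le \Theta' < \frac{\pi}{4} - \beta, \quad \frac{3\pi}{4} \le \Theta' < \frac{5\pi}{4} - \beta, \\[2pt] \Theta' + \beta - \frac{\pi}{2} & \text{for } \frac{\pi}{4} - \beta \le \Theta' < \frac{\pi}{4}, \quad \frac{5\pi}{4} - \beta \le \Theta' < \frac{5\pi}{4}, \\[2pt] \Theta' - \beta & \text{for } \frac{\pi}{4} + \beta \le \Theta' < \frac{3\pi}{4}, \quad \frac{5\pi}{4} + \beta \le \Theta' < \frac{7\pi}{4}, \\[2pt] \Theta' - \beta + \frac{\pi}{2} & \text{for } \frac{\pi}{4} \le \Theta' < \frac{\pi}{4} + \beta, \quad \frac{5\pi}{4} \le \Theta' < \frac{5\pi}{4} + \beta. \end{cases} \tag{4.4}$$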


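Since the relationship between $\Theta$ and $\Theta'$ in $[-\frac{\pi}{4}, \frac{7\pi}{4})$ is one-to-one and piecewise linear with slope 1, if $\Theta'$ is uniformly distributed on $[-\frac{\pi}{4}, \frac{7\pi}{4})$, then so is $\Theta$. Moreover the pairs $(\Theta,\Theta')$ defined by (4.4) satisfy conditions (4.3) and we now check this for $-\frac{\pi}{4} \le \Theta' < \frac{\pi}{4}$, in which case $\cos 2\Theta' \ge 0$ (the remaining cases being similar). When $-\frac{\pi}{4} \le \Theta' < \frac{\pi}{4} - \beta$, we have $\Theta = \Theta' + \beta$ and thus

$$\sin(\Theta+\Theta'-\gamma)\sin(\Theta-\Theta'-\beta) = 0 \ge 0, \qquad \cos(\Theta+\Theta'+\beta)\cos(\Theta-\Theta'+\gamma) = \cos 2\Theta \cos 2\alpha \ge 0.$$

When $\frac{\pi}{4} - \beta \le \Theta' < \frac{\pi}{4}$ we have $\Theta = \Theta' + \beta - \frac{\pi}{2}$ and thus

$$\sin(\Theta+\Theta'-\gamma)\sin(\Theta-\Theta'-\beta) = \sin\!\left(2\Theta' + \beta - \tfrac{\pi}{2} - \gamma\right)\sin\!\left(-\tfrac{\pi}{2}\right) = \cos 2(\Theta'-\alpha') \ge 0$$

since $0 \le \Theta' - \alpha' \le \frac{\pi}{4}$, and

$$\cos(\Theta+\Theta'+\beta)\cos(\Theta-\Theta'+\gamma) = \cos\!\left(2\Theta + \tfrac{\pi}{2}\right)\cos\!\left(\beta + \gamma - \tfrac{\pi}{2}\right) = -\sin 2\Theta \sin 2\alpha \ge 0$$

since $-\frac{\pi}{4} \le \Theta < \beta - \frac{\pi}{4} \le 0$.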

The remaining cases are treated similarly, and Figure II shows the graph of $\Theta = f(\Theta')$ which achieves the desired properties. The graph is plotted for $-\frac{\pi}{4} \le \Theta' < \frac{3\pi}{4}$, and for $\frac{3\pi}{4} \le \Theta' < \frac{7\pi}{4}$ the graph is obtained by shifting the plotted graph by $(\pi,\pi)$. $\square$
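The construction (4.4) lends itself to a direct numerical check. The following sketch is a hedged illustration: the function names, the grid size, and the sample values $\rho = 0.8$, $\rho' = 0.3$ are assumptions, and the case $0 < \beta \le \pi/4$ is assumed throughout. It implements the map $\Theta' \mapsto \Theta$, measures the largest violation of (4.2) along a grid, and inspects the sorted images of the grid as a crude test that the map is one-to-one onto $[-\pi/4, 7\pi/4)$.

```python
import math

QUARTER = math.pi / 4

def theta_from_theta_prime(theta_p, beta):
    """Map (4.4) on [-pi/4, 7*pi/4), assuming 0 < beta <= pi/4."""
    # Reduce to the plotted range [-pi/4, 3*pi/4); the rest is a (pi, pi) shift.
    shift = math.pi if theta_p >= 3 * QUARTER else 0.0
    t = theta_p - shift
    if t < QUARTER - beta:
        theta = t + beta
    elif t < QUARTER:
        theta = t + beta - 2 * QUARTER
    elif t < QUARTER + beta:
        theta = t - beta + 2 * QUARTER
    else:
        theta = t - beta
    return theta + shift

def grid(n_grid):
    """Midpoints of n_grid equal cells of [-pi/4, 7*pi/4)."""
    step = 2 * math.pi / n_grid
    return [-QUARTER + step * (i + 0.5) for i in range(n_grid)]

def max_violation(alpha, alpha_p, n_grid=2000):
    """Largest value of (left side - right side) of (4.2) along the grid."""
    beta = alpha - alpha_p
    worst = -math.inf
    for theta_p in grid(n_grid):
        theta = theta_from_theta_prime(theta_p, beta)
        left = max(abs(math.cos(theta - alpha)), abs(math.sin(theta + alpha)))
        right = max(abs(math.cos(theta_p - alpha_p)),
                    abs(math.sin(theta_p + alpha_p)))
        worst = max(worst, left - right)
    return worst

def image_gaps(beta, n_grid=2000):
    """Smallest image, largest image, and min/max gap of the sorted images."""
    images = sorted(theta_from_theta_prime(t, beta) for t in grid(n_grid))
    gaps = [b - a for a, b in zip(images, images[1:])]
    return images[0], images[-1], min(gaps), max(gaps)

alpha = 0.5 * math.asin(0.8)    # sin(2*alpha)  = rho  = 0.8 (illustrative)
alpha_p = 0.5 * math.asin(0.3)  # sin(2*alpha') = rho' = 0.3 (illustrative)
print(max_violation(alpha, alpha_p))
print(image_gaps(alpha - alpha_p))
```

On two of the four pieces of (4.4) the inequality (4.2) holds with equality, so the reported violation is expected to sit at rounding-error level rather than strictly below zero.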

If we choose

$$F = \{(x_1,x_2,y_1,y_2) : |x_1| \ge |x_2| \text{ or } |y_1| \ge |y_2|\}$$

then (C1F) becomes equivalent to

$$H(A^c) \ge H'(A^c), \text{ i.e. } H(A) \le H'(A), \text{ for all rectangles } A = \{(x,y) : |x| < a,\ |y| < b\}, \tag{4.5}$$

and the corresponding conditions (C0F), (C3F) and (4.5) are equivalent. When $H$ and $H'$ are absolutely continuous elliptically contoured distributions $EC_2(0,\Sigma,\phi)$ and $EC_2(0,\Sigma',\phi)$, where $\Sigma, \Sigma'$ are as in Theorem 3.2 with $|\rho| \le |\rho'|$, then (4.5) is Theorem 2.1 of [3], and thus (C0F) and (C3F) both hold.

If we choose

$$F = \{(x_1,x_2,y_1,y_2) : (x_1,y_1) \ge_m (x_2,y_2)\},$$

where for two-dimensional vectors $(x_1,y_1) \ge_m (x_2,y_2)$ means $\max(x_1,y_1) \ge \max(x_2,y_2)$ and $x_1 + y_1 = x_2 + y_2$, then (C1F) becomes equivalent to

$$H(A) \ge H'(A) \quad \text{for all measurable Schur-convex sets } A \tag{4.6}$$

($A$ is Schur-convex if $Z \in A$ and $Z' \ge_m Z$ imply $Z' \in A$) and the corresponding conditions (C0F), (C3F) and (4.6) are equivalent. This is Theorem 2.2 in [6]. All the examples described above have obvious n-dimensional analogues.
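The two-dimensional ordering $\ge_m$ and the definition of a Schur-convex set can be made concrete on a small integer grid. The sketch below is illustrative only: the helper names, the grid, and the two example sets are assumptions, and checking on a finite grid can refute Schur-convexity but cannot establish it in general.

```python
from itertools import product

def majorizes(z, w):
    """z >=_m w in the two-dimensional sense used in the text:
    max(z) >= max(w) and equal coordinate sums."""
    return max(z) >= max(w) and sum(z) == sum(w)

def is_schur_convex_on_grid(indicator, grid):
    """Check that w in A and z >=_m w imply z in A, over all grid pairs."""
    return all(
        indicator(z)
        for w, z in product(grid, repeat=2)
        if indicator(w) and majorizes(z, w)
    )

grid = list(product(range(-3, 4), repeat=2))

def spread_set(z):
    """A = {(x, y): |x - y| >= 2}; for a fixed sum, |x - y| grows with max(x, y)."""
    return abs(z[0] - z[1]) >= 2

def half_plane(z):
    """A = {(x, y): x >= 1}; not symmetric in (x, y), so not Schur-convex."""
    return z[0] >= 1

print(is_schur_convex_on_grid(spread_set, grid))
print(is_schur_convex_on_grid(half_plane, grid))
```

The half-plane fails because $(0,1) \ge_m (1,0)$ while only $(1,0)$ belongs to it.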